ABSTRACT


Previous authors have postulated that faults are related to each other and 
testers have tried to exploit the effect. However, the evidence and applications 
have been largely anecdotal. This thesis uses an analytical derivation of software 
failure regions to develop a quantitative metric of the relationship of one fault to an- 
other. This metric is then applied in an empirical study of a population of failure re- 
gions derived from faults used in a previous experiment. The failure regions were 
analyzed for clustering behavior using graph theory techniques. The goal of this 
study is to be able to use information about known faults in a program as a means 
of finding other faults in the same program. This study provides strong evidence 
that failure regions have a tendency to form clusters. Further, two specific charac- 
teristics of failure regions that lead to cluster formation are identified: shared 
bounding conditions (the Identical dimension) and shared variables that appear in 
different contexts (the Coincidental dimension). The clusters formed
by these two dimensions are markedly different in nature. The Identical dimension clusters
are small, isolated, and strongly connected. The Coincidental dimension clusters 
are larger and more loosely connected. Software testing implications of failure re- 
gion clustering behavior are discussed. 


I. INTRODUCTION


A. MOTIVATION FOR THIS STUDY

On July 10, 1991 an article in the Wall Street Journal blamed major telephone 
outages on a software failure. Three incorrectly set flag bits resulted in the omis- 
sion of congestion control algorithms. DSC Communications Corp., the manufac- 
turer of the faulty signaling system, reported that 


... Pacific Bell ... had requested software changes involving perhaps three 
or four lines of code. Engineers decided that because the change was 
minor, the massive program didn't need to undergo the rigorous 13-week 
test that most software is put through before it is shipped to customers. 
(Wall Street Journal, 1991) 


Engineers were also unable to explain why the problems didn't appear until several 
weeks after installation, and then only in two of the five Bell companies where the 
revised software was installed. 

The decision by the engineers to forgo testing because the changes were
"minor" indicates a misunderstanding about how various parts of the program inter- 
act with each other. Their confusion about the delayed and selective appearances 
of the faults points to a lack of understanding about the conditions that had to be 
met for the fault to produce a failure. These problems clearly show the need for a 
method of projecting how changes will affect the performance of a program and 
how to deal with the conditions that cause those effects. 

As general-purpose computing systems perform more and more sophisticated 
functions, the software necessarily becomes larger and more complex. As soft- 
ware size and complexity grow linearly, software testing and debugging become 
exponentially harder. In fact, testing consumes as much as half of the budget for 
the development of most major software systems, while error correction and spec- 
ification revision account for up to 90% of software life-cycle costs after the soft- 
ware has been marketed (Alberts, 1976). 


Unfortunately, extensive software testing is frequently necessary in spite of its 
expense. Many computer applications require fault-free, or at least fault-tolerant, 
operation. Examples include aircraft control and medical systems. In applications 
such as these, computer failure may result in a disaster, such as the loss of life or 
capital equipment. Even for systems that are less critical, such as the telephone 
example described above, failure can cause a significant loss of time, money, or 
productivity. 

The need for reliable computers can only be expected to grow. This implies the 
need for reliable software since software failures are responsible for the majority of 
failures in computing systems that have fault tolerant hardware. Careful specifica- 
tion, design, and testing are the keys to producing reliable software. This thesis 
deals with the area of software testing. 

The following sections briefly outline the background for this study, the hypoth- 
esis of the study, and a description of the experiment that was used to test the 
hypothesis. 


B. BACKGROUND 

The ANSI/IEEE standard definition of a fault is an accidental condition that 
may cause a program to fail. Failure means that a program does not perform its 
required function. This may mean that the program does not execute or that it executes
and produces an error. An error is a discrepancy between a computed value
or condition and the true, specified or theoretically correct value or condition.
(Glossary, 1983)

A subset of the program domain (i.e., input space) is associated with every 
fault in a program. Sets of bounds delimit this subset, one set corresponding to 
each variable in the domain. These bounds identify the values of the program vari- 
ables that will result in program failure due to that specific fault. Every variable must 
be within its specified bounds before that fault will produce a failure. Ammann and 
Knight called the subset of the domain associated with a fault its failure region 
(Ammann and Knight, 1988). They determined failure regions empirically, by repet- 
itive probing, rather than analytically. 


Bolchoz described three conditions that are required for a fault to produce a 
failure. First, all the conditions for the fault to be executed must be met. Second, 
the fault must be executed in a way that produces an error. Finally, the error must 
be propagated to a final result without being masked by subsequent processing. 
The failure region of a fault is the subset of the program domain that allows the fault 
to satisfy all of these conditions simultaneously (Bolchoz, 1990). 


C. HYPOTHESIS 

Bolchoz’s study considered how to identify the failure regions of isolated faults. 
He did not consider relationships between faults. Elements of his analysis, how- 
ever, suggested that failure regions may exhibit a relationship that links faults to 
each other. Failure regions are derived directly from their associated faults. There- 
fore, a relationship between failure regions would imply a relationship between 
their associated faults. If such a relationship exists, then the failure region of a 
known fault may be useful in deriving information about other failure regions. This 
information may, in turn, lead to the discovery of other faults.

The primary goal of this research was to develop a technique for empirically 
examining failure regions to determine what relationships exist between failure 
regions. A secondary goal was to characterize the relationships. The hope is that 
these relationships may be useful in fault-detection applications. 

Some difficulties arose during the development of the analysis technique. The 
first was that there was no statistical information about the behavior of failure 
regions. Which features of failure regions should be used in characterizing their 
behavior? What type of distribution does their behavior exhibit? 

A second problem was the dimensionality of failure regions. A failure region 
has a separate dimension associated with each program variable. Failure regions 
for practical software can easily have several hundred dimensions. Not all of these
dimensions are necessarily orthogonal.

A third difficulty was how to quantify similarities between failure regions. Ideas 
such as Euclidean distance have no meaning because of the heterogeneity of the 
failure regions. How can the bounds of the variables in two failure regions be used
to measure their “closeness”? How can the similarities of one pair of failure
regions be compared with the similarities of another pair when the two pairs
are completely dissimilar?

In order to make the problem tractable, it was assumed that the failure regions 
would have Gaussian-distributed behavior. Additionally, it was assumed that all
variables affected failure region behavior in the same way. This allowed relation- 
ships between failure regions to be identified by the number of variables their 
bounds had in common. 
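
The simplification described above can be sketched in a few lines. This is only an illustration, assuming that each failure region is recorded as a mapping from variable names to bounds; the variable names and bounds shown are hypothetical and are not taken from the programs studied.

```python
# Hypothetical failure regions: variable name -> (lower bound, upper bound).
# The names and values are illustrative only.
region_a = {"range": (0, 500), "fuel": (0, 10), "heading": (90, 180)}
region_b = {"range": (0, 500), "fuel": (5, 20), "altitude": (0, 1000)}


def shared_variables(first, second):
    """Return the set of variables that are bounded in both failure regions."""
    return set(first) & set(second)


def relationship_strength(first, second):
    """Count the shared variables, treating every variable as equally important."""
    return len(shared_variables(first, second))


print(sorted(shared_variables(region_a, region_b)))
print(relationship_strength(region_a, region_b))
```

Because every variable is assumed to count equally, the measure ignores how far apart the bounds themselves are.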


D. DESCRIPTION OF THE EXPERIMENT 


The empirical data for this study come from a set of programs published in a 
previous study. Shimeall and Leveson wrote a functional specification for a combat 
simulation program. Eight pairs of undergraduate students independently wrote 
programs based on this specification. The eight programs were then extensively 
tested (Shimeall and Leveson, 1991). The failure regions for the known faults in
these programs have been identified using Bolchoz's method. 

The problem is analyzed with graph theory techniques. Failure regions are 
modeled as nodes in a series of graphs. The relationships between the failure 
regions are modeled as edges. Edge weights are developed based on how many 
variables two failure regions share as well as the context of the variables within the 
failure regions. The single-link clustering method is used to study how failure 
regions tend to form clusters (Jain and Dubes, 1988, p. 70). The clustering tenden- 
cies provide insight into which types of failure region-variable behaviors may pro- 
vide useful information for fault detection. 
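
The graph model can be sketched as follows. The sketch assumes that each failure region maps a variable to the text of its bounding condition, and it labels each edge with two counts: variables shared with the same bounding condition and variables shared with different conditions. These two counts are one possible reading of the Identical and Coincidental dimensions and are an assumption made for illustration; the region names and conditions are hypothetical.

```python
from itertools import combinations

# Hypothetical failure regions: variable -> bounding condition (as text).
regions = {
    "F1": {"range": "range < 500", "fuel": "fuel <= 10"},
    "F2": {"range": "range < 500", "speed": "speed > 0"},
    "F3": {"fuel": "fuel > 90", "speed": "speed > 0"},
    "F4": {"altitude": "altitude == 0"},
}


def edge_weights(first, second):
    """Return (identical, coincidental) counts for two failure regions."""
    shared = set(first) & set(second)
    # Same variable with the same bounding condition.
    identical = sum(1 for v in shared if first[v] == second[v])
    # Same variable appearing under a different condition.
    coincidental = len(shared) - identical
    return identical, coincidental


def build_graph(regions):
    """Nodes are failure regions; an edge exists when any variable is shared."""
    edges = {}
    for a, b in combinations(sorted(regions), 2):
        identical, coincidental = edge_weights(regions[a], regions[b])
        if identical or coincidental:
            edges[(a, b)] = {"identical": identical, "coincidental": coincidental}
    return edges


graph = build_graph(regions)
for pair, weights in graph.items():
    print(pair, weights)
```

In this small example F4 shares no variable with any other region, so it remains an isolated node.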


E. OVERVIEW OF THE THESIS 


Chapter II gives a more extensive literature review of software testing in gen- 
eral and failure region analysis in particular. Chapter III describes how the data 
were converted into graphs and discusses the details of graph theory and cluster 
analysis that apply to this study. Chapter IV describes the methods of analysis and 
the results of the analyses. Finally, Chapter V summarizes the conclusions that can 
be drawn from the results and offers directions for further research. 


II. BACKGROUND AND RELATED WORK


This chapter reviews software testing definitions and methods for dealing with 
software failures. It then discusses theories of software testing, concentrating on 
models that are germane to this work. Next it presents previous work in the area of 
failure region analysis. Finally, it reviews the basis for the cluster analysis
techniques that are used in the experimental portion of this study. 


A. SOFTWARE TESTING 


1. Faults and Failures 

Software developers realized long ago that virtually all software is faulty. 
However, faulty programs do not always fail. While this may be fortunate from the 
standpoint of the user, it is troublesome from the standpoint of the tester. The 
telephone system example cited in the first chapter demonstrated that a program 
may run correctly for an indefinite period of time before it fails. It also showed that
just because a fault goes unnoticed it does not mean that the failure will be 
insignificant. A great deal of money and productivity were no doubt forfeited by 
customers who suffered the loss of their telephone service. 

If software developers concede that their software contains faults and if 
they desire to ensure that those faults do not result in software failures, then the 
question is how to deal with the faults. There are two possible approaches: either 
they must find the faults and eliminate them or they must develop methods of 
tolerating the faults. This thesis deals with the fault-elimination approach. 


2. Software Fault Elimination 
The goal of software fault elimination is to find every fault in the software 
and remove it, thereby producing a fault-free program. There are numerous 
methods of fault elimination. The literature on these is extensive and will not be 
reviewed here. Myers (Myers, 1979) and Beizer (Beizer, 1990) both give excellent
surveys of these methods. The discussion here will concentrate on fault-based
testing.


a. Two Different Approaches to Software Testing 


Myers claims that since software contains faults and since the purpose 
of software testing is to eliminate faults, then the only successful test is one that 
finds a software fault (Myers, 1979, pp. 4-7). In other words, if the program runs 
correctly on a given test, then that test failed. This approach requires a somewhat 
destructive mentality; the tester is trying to break the program and he is 
disappointed if he cannot. Many software testers have subscribed to this theory. 

The difficulty with Myers' theory is that there is no clear criterion for 
termination of testing. Neither tests that succeed nor tests that fail under this theory 
provide any information about either the presence or the absence of other faults in 
the software. 

Morell offers a more constructive theory of testing (Morell, 1990). The 
difference in his approach is not so much in the tests that are run as in the 
information that can be gleaned from the tests. Under this theory, a test that fails 
by Myers' definition may still yield valuable information about which faults 
specifically cannot exist in the program. The advantage of this theory is that a 
criterion for completion of testing is available. The tester specifies the faults that he 
wishes to ensure are not present; he then tests to show that those faults are not 
present. The danger, of course, is that the tester may fail to specify faults that are, 
in fact, present in the software. 

Methods based on Myers' theory have primarily been concerned with 
establishing the necessary conditions for a fault to cause a failure. An example of 
these conditions would be all-statements coverage. However, in order to ensure 
that a fault causes a failure during testing, both the necessary and the sufficient 
conditions must be met. The necessary conditions only guarantee that a fault will 
be executed. The sufficient conditions, on the other hand, guarantee that if a fault 
is executed then it will produce a failure. This is where Morell's theory offers
an advancement over previous theories. The next section outlines a theory of test data
selection that aims at being both necessary and sufficient.


b. A Theory of Test Data Selection 


Goodenough and Gerhart first presented the idea of selecting test 
data that guarantee detection of faults. They called a test data set reliable if it 
uncovered a given fault consistently and valid if it was capable of detecting every 
error in the program. They called a test set complete if it was both reliable and valid.
They suggested using condition tables derived from the program specification for 
selecting test data. (Goodenough and Gerhart, 1975) 

Weyuker and Ostrand pointed out that while Goodenough and 
Gerhart's theory provided valuable insight on the properties that test data should 
have, it did not tell the tester how to find such data. In general, it is difficult to devise
tests that meet Goodenough and Gerhart's definition of completeness. Weyuker 
and Ostrand suggested a more pragmatic goal for testing, namely, proving the 
absence of specified faults rather than all faults. They proposed to do this by using 
revealing subdomains. A revealing subdomain is a subset of a program’s input 
domain that contains only inputs that are guaranteed to reveal a fault. In other 
words, revealing subdomains provide the necessary and sufficient conditions for 
producing failures from specified faults. (Weyuker and Ostrand, 1980) 

Weyuker and Ostrand generated revealing subdomains by
intersecting two input domain partitions. The first partitioning was into sets that 
caused a specific path or family of related paths to be executed. They called these 
path domains. These partitions describe how the program actually treats the input 
domain. The second partitioning was based on program specifications, algorithms, 
and data structures. They called this the problem partition. These partitions 
describe how the program should treat the input domain based on the desired 
function of the program. The intersection of these two partitions produced sets that 
were characterized by the conjunction of the path conditions and the problem 
conditions. These are the sets they used for test data selection. Since ideally the
two partitions should agree, intersections where they do not agree are probably
fruitful places to search for failure-producing inputs. (Weyuker and Ostrand, 1980)
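
The intersection of the two partitions can be illustrated with a toy example. The sketch below assumes a small integer input domain and a hypothetical implementation that tests x > 0 where the specification calls for x >= 0; none of these details come from Weyuker and Ostrand's paper.

```python
from collections import defaultdict

DOMAIN = range(-3, 4)  # small hypothetical input domain


def path_class(x):
    """Which branch the (faulty) implementation takes for input x."""
    return "positive-branch" if x > 0 else "other-branch"


def problem_class(x):
    """Which case the specification says x belongs to."""
    return "non-negative" if x >= 0 else "negative"


def intersect_partitions(domain, first, second):
    """Group inputs by the pair (class under first, class under second)."""
    cells = defaultdict(list)
    for x in domain:
        cells[(first(x), second(x))].append(x)
    return dict(cells)


cells = intersect_partitions(DOMAIN, path_class, problem_class)
for key, members in cells.items():
    print(key, members)
```

The cell that pairs "other-branch" with "non-negative" contains only the input 0. It is the place where the path partition and the problem partition disagree, and so it is a natural source of test data.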

Richardson and Clarke proposed a method similar to Weyuker and 
Ostrand’s. They partitioned the input space into subdomains using information 
from both the program's specification and its implementation. They then proposed 
using symbolic execution to determine if the implementation agreed with the 
specification. (Richardson and Clarke, 1981) 

Richardson and Thompson developed the RELAY model of fault 
detection based on an earlier version of Morell's fault-based testing theory. A 
potential failure is originated when a fault is executed. This is the necessary 
condition for failure. The potential failure is then relayed through the program by 
computational and data flow transfers until it is manifested as an output error 
[failure]. The computational and data flow transfers are the sufficient conditions for 
failure. The failure must be both originated and relayed or it will not be revealed. 
Thus, this model provides a practical framework for selecting test data that are both 
necessary and sufficient for guaranteeing fault detection. (Richardson and 
Thompson, 1988) 


c. Mutation Testing 

The works of Morell and of Richardson and Thompson are adapted 
from mutation testing (DeMillo, et al., 1978). The idea of mutation testing is 
predicated on two assumptions. The first is the competent programmer 
assumption; it is assumed that the software is only "slightly" incorrect. For example,
it is assumed that a numerical integration algorithm is not used in place of a 
differentiation algorithm. Although the assumption seems reasonable, it cannot be 
verified or for that matter even quantified. The second assumption of mutation 
testing is the coupling effect; that is, that tests that detect simple faults will also be 
sensitive to more complex faults. This effect is further discussed in the cluster 
analysis section below. 

The basic method of mutation testing is to try to identify the classes of 
faults that might exist in the software. Perhaps the designer indexed an array with
the wrong loop counter or the programmer substituted a Boolean OR for a Boolean
AND. Mutations of the program are generated from the identified classes of faults.
Test data are then sought that will distinguish the mutations from the original
program. A mutation survives the testing either because it is functionally equivalent
to the original program or because the test data are not sensitive enough to make the
distinction.
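
The Boolean substitution mentioned above makes a convenient small example. The sketch assumes a hypothetical two-input program and a single mutant; it shows how one test set lets the mutant survive while a slightly larger one distinguishes it.

```python
def original(a, b):
    """Hypothetical program under test: true only when both flags are set."""
    return a and b


def mutant(a, b):
    """Mutant with the Boolean AND replaced by a Boolean OR."""
    return a or b


def killing_inputs(program, mutation, test_data):
    """Return the test inputs whose outputs distinguish the mutant."""
    return [args for args in test_data if program(*args) != mutation(*args)]


weak_tests = [(True, True), (False, False)]
strong_tests = weak_tests + [(True, False)]

print(killing_inputs(original, mutant, weak_tests))    # mutant survives
print(killing_inputs(original, mutant, strong_tests))  # mutant is distinguished
```

The mutant survives the first test set only because the data are not sensitive enough, not because the two programs are equivalent.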


d. Partition Testing 

All the testing theories and methods that have been discussed here fall 
under the general category of partition testing. The primary characteristic of 
partition testing is that the program’s input domain is divided into subdomains. The 
tester builds his test set by selecting elements from each subdomain. Partition 
testing ranges from random testing to exhaustive testing. In the former, there is one 
partition, namely, the entire input space. In the latter, there are as many partitions 
as there are elements in the domain. Mutation testing is partition testing in that it 
divides the domain into partitions that distinguish the various mutants. 

Weyuker and Jeng examined partition testing strategies analytically. 
They showed that, in general, arbitrary partitioning strategies may provide results 
that are either better or worse than random strategies. (They used partitioned to 
mean more than one subdomain.) They also showed that if an appropriate method 
exists for refining partitions, then improvement of the performance of partitioning 
strategies over random strategies can be guaranteed. While Weyuker and Jeng 
present no specific strategy, their results suggest that refinement should be fault- 
based, i.e., that partitions should be designed with particular faults in mind. 
(Weyuker and Jeng, 1991) 
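
The comparison between partition and random strategies can be made concrete with a small calculation. The sketch below assumes a formulation commonly used in analyses of this kind: test inputs are drawn uniformly and independently, so the probability of finding at least one failure is one minus the probability that every test passes. The domain sizes and failure counts are hypothetical.

```python
from math import prod


def random_detection(failure_rate, tests):
    """Probability that at least one of `tests` random inputs fails."""
    return 1 - (1 - failure_rate) ** tests


def partition_detection(subdomains):
    """Probability that at least one test fails under a partition strategy.

    `subdomains` is a list of (size, failing_inputs, tests) tuples; tests are
    assumed to be drawn uniformly and independently within each subdomain.
    """
    return 1 - prod((1 - failing / size) ** tests
                    for size, failing, tests in subdomains)


# Hypothetical domain: 100 inputs, 4 of which cause failure; 2 tests in total.
baseline = random_detection(4 / 100, 2)
isolating = partition_detection([(4, 4, 1), (96, 0, 1)])  # failures isolated
unlucky = partition_detection([(1, 0, 1), (99, 4, 1)])    # one test wasted

print(baseline, isolating, unlucky)
```

With the same number of tests, the partition that isolates the failure-causing inputs does better than random testing, and the partition that spends a test on a failure-free subdomain does worse. This is consistent with the observation that arbitrary partitions can go either way.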

In summary, most testing strategies guess at the nature of the faults 
that might be present and then try to develop test sets to uncover the hypothesized 
faults. This might be characterized as an outside-to-inside approach. Little study 
has been done to determine how faults really behave. This study has the goal of 
determining actual fault characteristics that may be useful in locating faults. This 
might be termed more of an inside-to-outside approach to testing. 


3. Failure Regions 

A subset of the program domain is associated with every fault in a 
program. Sets of bounds delimit this subset, one set of bounds corresponding to 
each of the variables in the domain. These bounds identify the values that the 
program variables must assume in order for that specific fault to cause a program 
failure. Every variable must be within its specified bounds before that fault will 
produce a failure. Ammann and Knight called the subset of the domain associated 
with a fault its failure region (Ammann and Knight, 1988). 

Ammann and Knight used failure regions to develop an approach to 
software fault tolerance called data diversity. They suggested that for many 
program variables there is a set of values that will produce equivalent program 
behavior. If a fault produces a failure and if there is an equivalent value for the 
offending variable that lies outside the failure region, then failure can be avoided 
by substituting the equivalent value. Data diversity is a fault-tolerance technique 
rather than a fault-elimination technique. (Ammann and Knight, 1988) 

Bolchoz developed an analytical method for identifying failure regions 
(Bolchoz, 1990). He described three conditions that are required for a fault to 
produce a failure. First, all the conditions for the fault to be reached must be met, 
e.g. appropriate procedure calls and program branches. Second, the fault must be 
executed in a way that produces an error or an erroneous intermediate result. 
Finally, the error must be propagated to a final result without being masked by 
subsequent processing. The failure region of a fault is the set of data values that 
satisfy the conjunction of these three conditions. The difference between this 
method and that of Weyuker and Ostrand is that this method identifies conditions 
for execution of a specific fault that is already known to exist while their method 
identifies conditions for where hypothesized faults are likely to exist. Shimeall et
al. showed that, under certain assumptions, Bolchoz's method provides the
necessary and sufficient conditions for a known fault to produce a failure (Shimeall, 
et al., 1991). 
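
The conjunction of the three conditions can be sketched as a membership test. The three predicates below are hypothetical placeholders over two input variables; in Bolchoz's method the actual conditions are derived analytically from the program and the known fault.

```python
def reached(x, y):
    """Hypothetical condition for the faulty statement to be reached."""
    return x > 10


def produces_error(x, y):
    """Hypothetical condition for the faulty statement to compute a wrong value."""
    return y != 0


def propagates(x, y):
    """Hypothetical condition for the wrong value to survive to the output."""
    return x + y < 100


def in_failure_region(x, y):
    """An input is in the failure region only if all three conditions hold."""
    return reached(x, y) and produces_error(x, y) and propagates(x, y)


inputs = [(5, 1), (20, 0), (20, 90), (20, 3)]
print([point for point in inputs if in_failure_region(*point)])
```

Each of the first three inputs violates exactly one condition, so only the last input lies in the failure region.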

Voas and Morell explored an idea similar to failure region analysis. They
called it propagation and infection analysis. They studied the sensitivity of 
programs to faults by executing the programs rather than by examining the 
program specification and implementation. They called the probability that a fault 
will be executed on a randomly selected input the execution rate. The probability 
that the fault will infect subsequent data states after the error occurs is the infection 
rate and the probability that the fault will persist to manifest a program failure is the 
propagation rate. They suggested empirical methods for estimating these rates. 
They used the conjunction of these individual rates to predict the program’s failure 
rate. (Voas and Morell, 1989) 
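
One simple reading of this combination, assumed here for illustration, treats each rate as a probability conditional on the previous stage and multiplies the three estimates. The counts below are hypothetical and do not come from Voas and Morell's experiments.

```python
def rate(occurrences, trials):
    """Estimate a probability as the fraction of trials in which an event occurred."""
    return occurrences / trials


# Hypothetical counts for one fault.
execution = rate(200, 1000)  # runs that executed the fault
infection = rate(50, 200)    # executions that corrupted the data state
propagation = rate(10, 50)   # corrupted states that reached the output

# Assumption: each rate is conditional on the previous stage having occurred.
failure_estimate = execution * infection * propagation
print(failure_estimate)
```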

Failure regions have been used to provide insight into the necessary and 
sufficient conditions for revealing specific faults and for understanding how specific 
faults behave in isolation. The study presented in this thesis is the first to collect 
information on how faults or failure regions are related to each other. Failure 
regions offer a mechanism for identifying common features among faults. Faults 
that have similarities in their failure regions might be expected to exhibit similar 
behaviors when they cause a failure. This thesis explores the similarities and 
differences between the failure regions of known faults in the same program with 
the goal of better understanding fault behavior. 


B. CLUSTER ANALYSIS 


1. Definitions 

Much scientific study is based on the classification of objects according to 
perceived similarities. Cluster analysis is the study of how to build a formal basis 
for this activity of classification that humans perform almost instinctively. Although 
the idea of deciding when objects are similar to each other may seem intuitively 
obvious, researchers have had difficulty in agreeing on a formal definition of a
cluster. One definition that fairly well describes the analysis performed in this thesis 
is: “Clusters may be described as connected regions of a multi-dimensional space 
containing a relatively high density of points, separated from other such regions by 
a region containing a relatively low density of points.” (Jain and Dubes, 1988, p. 1) 


2. The Basis for Using a Cluster Analysis Approach

Myers cites anecdotal evidence that the probability of the existence of 
undiscovered faults in a given section of code is proportional to the number of faults 
already found in that section (Myers, 1979, p. 16). He calls this tendency error 
[fault] clustering. Myers is speaking specifically of the proximities of faults to each 
other in the code, e.g., two sequential statements. 

The coupling effect is an idea that is similar to fault clustering. Offutt 
conducted an empirical study of the validity of the coupling effect. He tested 
programs that contained automatically generated first-order mutants. He then used 
the same data sets to test programs that contained second-order mutants that 
were generated from the first-order mutants. His results offer convincing evidence 
that any test that is sensitive to “simple” faults will also detect more “complex” 
faults. (Offutt, 1989) 

Mutation testing uses the assumption that there are relationships between 
faults as a basis for the technique. However, the approach tries to find faults by 
random (or exhaustive) generation of mutants; this is a rather computationally 
intensive approach. This thesis explores the idea of identifying the specific 
relationships that cause fault clusters. Specifically, common features of failure 
regions from the same program are identified. Failure regions are directly linked to 
specific faults. Thus, knowledge about these common features may raise the
probability of predicting the locations of undiscovered faults based on their 
relationships to faults that have already been found. 


3. Cluster Analysis Techniques 


a. A Graph Theory Approach 
The discussion in this section derives from Godehardt's presentation 
of graphs as structural models and their use in cluster detection (Godehardt, 1988). 
The discussion is specific to failure regions modeled as nodes and relationships 
between the failure regions modeled as edges. It is assumed that the reader is 
familiar with the concepts of graph theory. Definitions of graph theory terms are 
presented for reference in Appendix A. 

The connectivity (edge-connectivity) of a graph gives a qualitative idea
of both the nature and the strength of relationships between failure regions. If 
bridges or cutnodes exist, it may be possible to find blocks in a graph whose 
connectivity (edge-connectivity) is large relative to that of the graph itself. Even if 
there are no bridges or cutnodes, a graph that is n-connected (n-edge connected) 
for small n, e.g. 2 to 4, may still contain significant subgraphs that have relatively 
larger connectivity (edge-connectivity). Such blocks or subgraphs would suggest
that there are groups of failure regions that are strongly related to each other but
only weakly related to other failure regions. If the graph is disconnected then both
the absence of relationships between failure regions in different components and
the presence of relationships between failure regions within components are
emphasized.
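
The notions of components, cutnodes, and bridges can be illustrated with a brute-force sketch. It removes each node or edge in turn and counts the components that remain, which is adequate for small graphs. The example graph, two triangles joined by a single edge, is hypothetical.

```python
def components(nodes, edges):
    """Return the connected components of an undirected graph as sets."""
    neighbors = {n: set() for n in nodes}
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    seen, result = set(), []
    for start in nodes:
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(neighbors[node] - group)
        seen |= group
        result.append(group)
    return result


def cutnodes(nodes, edges):
    """Nodes whose removal increases the number of components."""
    base = len(components(nodes, edges))
    found = []
    for n in nodes:
        rest = [m for m in nodes if m != n]
        kept = [(a, b) for a, b in edges if n not in (a, b)]
        if len(components(rest, kept)) > base:
            found.append(n)
    return found


def bridges(nodes, edges):
    """Edges whose removal increases the number of components."""
    base = len(components(nodes, edges))
    return [e for e in edges
            if len(components(nodes, [f for f in edges if f != e])) > base]


# Hypothetical failure regions: two tightly related groups joined by one edge.
nodes = ["A", "B", "C", "D", "E", "F"]
edges = [("A", "B"), ("B", "C"), ("A", "C"),
         ("C", "D"),
         ("D", "E"), ("E", "F"), ("D", "F")]

print(cutnodes(nodes, edges), bridges(nodes, edges))
```

Removing the bridge leaves two blocks, each of which is more strongly connected than the graph as a whole.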

The diameter, radius, and center of a subgraph indicate how intricately 
the failure regions are related. If the diameter is one or two then every pair of failure 
regions is either directly related or both regions in the pair are related to the same 
failure region. If the diameter is large but the radius is small, then the center of the 
graph is a subgraph that has relationships analogous to the supergraph with a 
small diameter. 
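
These measures can be computed from the eccentricity of each node, that is, its greatest shortest-path distance to any other node. The sketch below uses breadth-first search on a hypothetical connected cluster of five failure regions arranged in a path.

```python
from collections import deque


def eccentricities(neighbors):
    """Map each node of a connected graph to its greatest distance to any node."""
    result = {}
    for source in neighbors:
        distance = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nxt in neighbors[node]:
                if nxt not in distance:
                    distance[nxt] = distance[node] + 1
                    queue.append(nxt)
        result[source] = max(distance.values())
    return result


# Hypothetical connected cluster: a path F1 - F2 - F3 - F4 - F5.
cluster = {
    "F1": ["F2"],
    "F2": ["F1", "F3"],
    "F3": ["F2", "F4"],
    "F4": ["F3", "F5"],
    "F5": ["F4"],
}

ecc = eccentricities(cluster)
diameter = max(ecc.values())   # largest eccentricity
radius = min(ecc.values())     # smallest eccentricity
center = sorted(node for node, e in ecc.items() if e == radius)
print(diameter, radius, center)
```

Here the diameter is 4, the radius is 2, and the center is the single region F3.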

Relationships may also be modeled with a multigraph. Each graph in 
the multigraph has the same node set, but the edge sets are based on different 
criteria. In general, each graph in the multigraph has blocks containing different 
sets of nodes. If failure regions appear in two or more blocks across the multigraph, 
this might suggest how two different clustering criteria were related to each other.
These failure regions might also be important in characterizing variables that lead 
to certain faults. 


b. A Traditional Clustering Approach 
The goal of cluster analysis is to identify groups of objects that have 
similar characteristics. Most traditional clustering algorithms (as opposed to the 
graph theory methods described above) work on some variation of the following
method:

1. For the object that is to be placed in a cluster, find the single object that is
“closest” to the object of interest, and put those two objects in the same cluster.


2. If there is no "close" object, then start a new cluster. 
In other words, the clustering is essentially based on an object's relationship to its 
closest neighbor. 
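
The two steps above can be sketched for one-dimensional numeric data. The absolute difference used as the distance and the threshold that decides when no object is "close" are assumptions made for illustration.

```python
def nearest_neighbor_clusters(values, threshold):
    """Assign each value to the cluster of its closest already-placed value."""
    clusters = []  # each cluster is a list of values
    for value in values:
        best_cluster, best_distance = None, None
        for cluster in clusters:
            for member in cluster:
                d = abs(value - member)
                if best_distance is None or d < best_distance:
                    best_cluster, best_distance = cluster, d
        if best_cluster is not None and best_distance <= threshold:
            best_cluster.append(value)   # step 1: join the closest neighbor
        else:
            clusters.append([value])     # step 2: no close object, new cluster
    return clusters


print(nearest_neighbor_clusters([1.0, 1.5, 9.0, 2.0, 9.4], threshold=1.0))
```

Note that the value 2.0 joins the first cluster because of its distance to 1.5 alone, regardless of how far it is from the other members.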

There are two basic classifications of clustering techniques that follow 
this algorithm: partitioned and hierarchical. Partitioned clusters require every object 
to be in exactly one cluster. The researcher must decide a priori at what distance 
an object is too far away from its neighbors to be included in their cluster. This 
approach assumes that objects in different clusters are completely dissimilar. 

The hierarchical approach assumes that if the restrictions for 
comparison are relaxed sufficiently (e.g., to no restriction at all), then no two 
objects are absolutely dissimilar. This method starts by forming clusters with strict 
criteria and then allows the clusters to merge as the criteria are relaxed. When the 
clustering criteria have been relaxed sufficiently, all the objects will form one 
cluster. 

One difficulty in applying these methods to the current problem is in 
determining when two failure regions are close to each other. The sample space is 
heterogeneous and the relationships between the failure regions are ordinal. Both 
of these factors make the idea of Euclidean distance meaningless. Some other 
measure of "distance" between failure regions is required. The approach used in 
this study is described in detail in the next chapter. 

The clustering method used in this study is a hierarchical method 
called single-link clustering. The method uses a threshold graph to construct the
clusters. This method is also described more fully in the next chapter.
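
As a preview, the threshold-graph idea can be sketched as follows. The sketch assumes that edge weights measure similarity, so the threshold graph at a given level keeps only the edges whose weight is at least that level, and the clusters are its connected components. The node names and weights are hypothetical.

```python
def threshold_clusters(nodes, weights, threshold):
    """Connected components of the graph keeping edges with weight >= threshold."""
    parent = {n: n for n in nodes}

    def find(n):
        # Follow parent links to the representative, shortening the path as we go.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for (a, b), w in weights.items():
        if w >= threshold:
            parent[find(a)] = find(b)
    groups = {}
    for n in nodes:
        groups.setdefault(find(n), []).append(n)
    return sorted(sorted(g) for g in groups.values())


nodes = ["F1", "F2", "F3", "F4", "F5"]
weights = {("F1", "F2"): 3, ("F2", "F3"): 2, ("F4", "F5"): 2, ("F3", "F4"): 1}

# Relaxing the threshold lets clusters merge until one cluster remains.
for level in (3, 2, 1):
    print(level, threshold_clusters(nodes, weights, level))
```

At the strictest level only F1 and F2 are clustered; at the most relaxed level all five regions form a single cluster, which matches the hierarchical behavior described above.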


C. CONCLUSION 


This chapter has described the background needed to support this study of 
failure region analysis. A reliable method for finding faults in software needs to be 
developed. An important element of a reliable testing method is its ability to establish
both the necessary and the sufficient conditions for a fault to be revealed. Failure
regions developed using Bolchoz's method have been shown to establish
these conditions for known faults. If relationships between failure regions can be 
characterized, then the failure regions of detected faults may provide information 
about where to find still more faults. One step towards characterizing these rela- 
tionships is to determine the clustering tendencies of failure regions. The next 
chapter describes the experiment used for studying failure region clusters.


III. EXPERIMENTAL DESIGN


A. INTRODUCTION 

This study analyzed similarities and differences between failure regions. The 
primary goal of this research was to develop a technique for empirically examining 
failure regions to determine what relationships exist. A secondary goal was to characterize
the relationships. A set of programs written to the same specification was
taken from a previous study (Shimeall and Leveson, 1991). The faults in these programs
provided the failure regions used in this study. Patterns of variable usage
were identified in these failure regions. Graphs based on this analysis used nodes 
to represent failure regions and edges to represent relationships based on the con- 
text and frequency of variable usage. Clustering patterns and tendencies among 
the failure regions were identified from these graphs. 

This chapter describes the data that were used for the study and how the data 
were reduced to a form useful for analysis. The methods of generating the graphs, 
including the edge weights, are presented. Finally, cluster analysis techniques are 
discussed. 


B. DESCRIPTION OF THE DATA 

Shimeall and his students are using a set of eight programs in an ongoing se- 
ries of software testing studies. Shimeall wrote a functional specification for a com- 
bat simulation program. Eight pairs of undergraduate students separately wrote 
programs based on this specification. Shimeall then extensively tested the programs
using code reading, assertions, testing, and voting. The numbers of known
faults in the various programs range from as few as 25 to as many as 50
(Shimeall and Leveson, 1991).

Bolchoz developed a method for determining the failure region of a fault based 
on the conditions that must be met for that fault to cause a failure (Bolchoz, 1990). 
Shimeall used Bolchoz's method to generate the failure regions of the faults in four
of the eight programs. Appendix B contains the failure region definitions for Version
1 of the program as an example. The complete set of failure region definitions is 
contained in a separate report (Shimeall, 1991). Table 3.1 gives a profile of the 
faults and failure regions by program version. The term dimensions is used to refer 
to either input variables or to predicates composed of input variables. The numbers 
of input variables exceed the numbers of dimensions because there are several input
variables that appear only in the variable predicates. These predicates are discussed
further in the next section.


TABLE 3.1: PROFILE OF FAILURE REGIONS

Versions
Known faults
Noncoincident regions
Total dimensions
Mean dimensions per region
Std. dev. dimensions per region
Total input variables
Mean input variables per region
Std. dev. input variables per region





C. DATA REDUCTION 

The first step in developing a strategy for exploiting relationships between fail- 
ure regions was to determine how to identify the relationship. Failure regions are 
defined by bounds on the various program variables. This suggested that the relationships
sought in this study might also be described in terms of these variable
bounds. Variables may be considered according to their syntax or their semantics.
Syntax deals with whether the variable is used legally within the constraints of the
language and the program. Semantics deals with the meaning of the variable in a specific
context. This study considered both syntax and semantics.


1. The Use Of Predicates 


As Shimeall derived the failure regions, he noted that some variables were 
used under commonly occurring conditions. The conditions were frequently related 
to semantic contexts in the program specification. When these conditions were 
noted, predicates were substituted for individual variables in order to identify the 
semantic context of the failure region. 

Predicates were treated in the same way as individual variables during the 
analysis. There were two reasons for choosing this approach. The first was that 
even though many of the same variables participate in the various predicates, the 
predicates are semantically different. Preserving the semantic contexts of these 
sets of variables within their respective failure regions helps to clarify the 
relationships between the failure regions. 

The second reason for using the predicates rather than their component 
variables was that most of the predicates involved numerous variables. Edges in 
the graphs were determined by how many variables two failure regions' bounds 
had in common. Since at least one predicate occurred in most of the failure 
regions, using only the individual variables could have resulted in complete, or 
nearly complete, graphs. This might have obscured interesting results. 

The problem with leaving the predicates intact was that the predicates are 
essentially semantic. On the other hand, individual variable incidence is primarily 
syntactic. This mixing of semantic and syntactic forms in the same analysis could 
lead to some distortion, especially since the decision of when to condense a set of 
bounds into a predicate was somewhat arbitrary. 

Thus, while it was recognized that some distortion would probably result
from either treatment of the predicates, it seemed that treating them in the same
manner as individual variables was more likely to filter some of the noise out of the
graphs and draw more attention to the useful differences and similarities of the
failure regions. Hereinafter, variables and predicates will be referred to collectively
as failure region dimensions.


2. Analysis of Dimension Participation in Failure Region Bounds 
Each pair of failure regions within a given version was compared. For each 
pair, each dimension was classified as participating in one of the following ways: 


1. The dimension appeared in both regions’ bounds in exactly the same way.
For example, in failure regions 1.3 and 1.4, Params.NumWEvents participates
in the bounds as an index to the same dimension (see Appendix B).
This type of participation was termed Identical.


2. The dimension appeared in both regions’ bounds but was not Identical. This 
type of participation was termed Coincidental. 


3. The dimension did not participate in the bounds of either of the regions in
the pair. This type of participation was termed Nonbounding. In effect, the
bounds that this dimension places on the failure region are no more restrictive
than the entire range of values that this dimension can assume.


The Identical and Coincidental dimensions are referred to collectively as the Com- 
posite dimension. 

(Initially, an attempt was made to identify dimensions that had similar 
behavior between two failure regions. For example, if the same dimension 
participated in an inequality in both failure regions but the inequalities were not 
Identical, this might have been considered Similar. However, subsequent analysis 
showed that the Similar dimension offered no useful insight and Similar was 
discarded as a separate dimension classification.) 

Dimensions that were Nonbounding for all failure regions in a given 
version were discarded from that version’s matrix. This significantly reduced the 
size of the matrices since there were 127 variables, besides the predicates, that 
could potentially participate in the bounds. The various versions studied here 
actually use from 48 to 53 dimensions in the bounds of their failure regions.


3. The Failure Region-Dimension Incidence Matrices


The results of the dimension evaluations were placed into incidence 
matrices with dimensions on the rows and failure regions on the columns. These 
matrices are in Appendix C. 

In the matrix for Version 1, the entry in column 1.10 for
Army[].Squadrons is I10. This entry indicates that the participation of
Army[].Squadrons in failure region 1.10 is Identical to itself and is not Identical
to its participation in any of the first nine failure regions. Coincidental behavior
between two failure regions is indicated if they both have an entry, but they are not
Identical to the same failure region. A blank entry in the matrix indicates that the
given dimension is Nonbounding for the given failure region.

Each entry in the matrix is referenced to the lowest numbered failure
region to which that dimension is Identical. As an example, both columns 1.10 and
1.11 contain the entry I1 for the dimension NArmy[]. This means that the
participation of NArmy[] in both of these regions is Identical to that in region 1.1.
Identity of participation is transitive, so the participation of NArmy[] in region 1.10
may be inferred to be Identical to its participation in region 1.11.
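
The encoding can be illustrated with a short sketch. It assumes that each matrix row is stored as a mapping from a failure region number to the number of the lowest numbered region with Identical participation, and that a missing key stands for a blank entry. The function name, the enumeration, and the sample rows are hypothetical and are not taken from Appendix C.

```python
from enum import Enum


class Participation(Enum):
    IDENTICAL = "Identical"
    COINCIDENTAL = "Coincidental"
    NONBOUNDING = "Nonbounding"


def classify(row, m, n):
    """Classify how one dimension participates in the pair of regions (m, n).

    `row` maps a region number to the lowest numbered region whose use of
    the dimension is Identical to it; a missing key is a blank entry.
    """
    ref_m, ref_n = row.get(m), row.get(n)
    if ref_m is None and ref_n is None:
        return Participation.NONBOUNDING  # bounds neither region
    if ref_m is None or ref_n is None:
        return None  # bounds only one region, so nothing is shared
    if ref_m == ref_n:
        return Participation.IDENTICAL  # same reference region
    return Participation.COINCIDENTAL  # both bounded, but differently


# Hypothetical rows shaped like the examples in the text.
narmy = {1: 1, 10: 1, 11: 1}
squadrons = {3: 3, 10: 10}

print(classify(narmy, 10, 11))
print(classify(squadrons, 3, 10))
```

Because every entry points at the lowest numbered Identical region, comparing two entries for equality is enough to apply the transitive inference described above.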

Both the failure regions' definitions and the failure region-dimension 
incidence matrices were generated manually. Because of this, some errors have 
undoubtedly been made. However, the numbers of distinct failure regions in the 
various versions used in this study range from 21 to 37. Thus the smallest graph 
could have as many as 210 edges while the largest could have as many as 666. If 
the errors are few, the effect on the validity of the qualitative results should not be
significant. 
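
The edge limits quoted above follow from counting the unordered pairs of nodes in a graph of n nodes, as this worked calculation shows:

```latex
\binom{n}{2} = \frac{n(n-1)}{2}, \qquad
\binom{21}{2} = \frac{21 \cdot 20}{2} = 210, \qquad
\binom{37}{2} = \frac{37 \cdot 36}{2} = 666.
```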


4. Failure Region Graphs 
Graphs were generated from the failure region-dimension incidence 
matrices for each version of the program. Each failure region was treated as a 
node. Weighted edges between the nodes were based on the numbers of 
dimensions the failure regions had in common as well as how those dimensions 
participated in their bounds.

The edge weights were calculated using the program listed in Appendix D.
This program takes the failure region-dimension incidence matrix as an input. It
identifies the value associated with each failure region-dimension pair, i.e., I or
blank. It also identifies the associated failure region number, i.e., the number
following the I. The program stores these values in an array indexed by dimension
and failure region numbers.

The program uses the failure region-dimension array to count the number 
of occurrences of Identical and Coincidental dimensions for each pair of failure 
regions. It also counts the total number of dimensions that appear in the bounds of 
at least one of the failure regions in that pair. The program calculates the edge 
weighting coefficients from these counts. (These coefficients are described in 
subsection 5 below.) Finally, the program lists: 

1. the edges, in descending order of their coefficients, 


2. the coefficient and the dimension counts associated with each edge (i.e.,
the coefficient numerator and denominator), and


3. the nodes, in order based on their largest incident edge. 


These graphs are presented in tabular form in Appendix E. 
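
The counting and listing steps can be sketched as follows. This is not the Appendix D program; it is a minimal illustration that assumes the matrix is held as a dictionary of rows, each row mapping a region number to its lowest numbered Identical region. The names `pair_counts` and `ranked_edges`, the ratio used as the weight, and the sample matrix are all assumptions made for the example.

```python
from itertools import combinations


def pair_counts(matrix, regions):
    """Count shared dimensions for every pair of failure regions.

    `matrix` maps a dimension name to a row; each row maps a region to the
    lowest numbered region with Identical participation (blank = absent).
    Returns {(m, n): (identical, coincidental, bounding_either)}.
    """
    counts = {}
    for m, n in combinations(sorted(regions), 2):
        identical = coincidental = either = 0
        for row in matrix.values():
            in_m, in_n = m in row, n in row
            if in_m or in_n:
                either += 1  # dimension bounds at least one region
            if in_m and in_n:
                if row[m] == row[n]:
                    identical += 1
                else:
                    coincidental += 1
        counts[(m, n)] = (identical, coincidental, either)
    return counts


def ranked_edges(counts, kind=0):
    """List edges in descending order of count / either (kind 0 = Identical)."""
    edges = [
        (c[kind] / c[2], pair, c[kind], c[2])
        for pair, c in counts.items()
        if c[2] > 0
    ]
    return sorted(edges, key=lambda e: e[0], reverse=True)


# Hypothetical three-dimension matrix for three regions.
matrix = {
    "NArmy[]": {1: 1, 2: 1, 3: 3},
    "Army[].Squadrons": {1: 1, 2: 2},
    "Params.NumWEvents": {2: 2, 3: 2},
}
counts = pair_counts(matrix, [1, 2, 3])
for weight, pair, numerator, denominator in ranked_edges(counts):
    print(pair, f"{weight:.2f}", numerator, denominator)
```

Each listed edge carries its numerator and denominator alongside the weight, mirroring the second item of the list above.
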
5. Determination Of Edge Weights 


a. Separate Analysis of Identical and Coincidental Data 


The data for this study are essentially ordinal, namely, in descending 
order: Identical, Coincidental, Nonbounding. Relative values cannot be assigned to 
data that are inherently ordinal; thus, there is no way to develop a single edge 
weight that accurately represents the relationship between two failure regions. 
Because of this, three separate graphs were developed for each version of the 
program. 

The first graph considered only Identical bounds. The second graph 
considered only Coincidental bounds. The third graph lumped the Identical and 
Coincidental dimensions together to form the Composite dimension. This third 
graph was developed to test whether splitting the dimension behaviors into
Identical and Coincidental had produced any artificial effects. This, then, resulted
in three sets of binary data: Identical or not, Coincidental or not, and Composite or
not.


b. Selection of the Weighting Coefficient 
Two different coefficients were considered for determining the values 
of the edge weights. Both coefficients give an indication of how closely related two 
failure regions are. The first was the simple matching coefficient, given in Equation 
3.1 (Jain and Dubes, 1988, p. 17). The numerator of this coefficient is the sum of 
the number of dimensions that are Composite for both regions and the number of 
dimensions that are Nonbounding for both regions. The denominator of the simple
matching coefficient is the total number of dimensions.

$$S(m,n) = \frac{a_{00} + a_{11}}{a_{00} + a_{01} + a_{10} + a_{11}}$$ (Eq 3.1)

where:

S(m,n) - simple matching coefficient for regions m and n.

$a_{00}$ - number of dimensions that are Nonbounding for m and n.
$a_{10}$ - number of dimensions that are Composite for m but not n.
$a_{01}$ - number of dimensions that are Composite for n but not m.
$a_{11}$ - number of dimensions that are Composite for m and n.


The simple matching coefficient assigns as much importance to 
Nonbounding dimensions as it does to Composite dimensions. The 
nonparticipation of a given dimension in a failure region simply means that the fault 
can result in a failure regardless of the value of that dimension. The primary goal 
of this study was to determine if failure regions can be used to identify dimensions 
of interest for software testing. Therefore, it is not particularly useful to know that 
the value of a dimension is irrelevant when the fault causes a failure. For the 
purposes of this study, the participation of a dimension in the bounds of a failure 
region is more significant than the nonparticipation of a dimension. 
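
A small numerical sketch shows why this matters. The function below computes the simple matching coefficient from four counts: dimensions Nonbounding for both regions (`a00`), Composite for only one region (`a01`, `a10`), and Composite for both (`a11`). The counts in the example are hypothetical and chosen only to show the effect.

```python
def simple_matching(a00, a01, a10, a11):
    """Simple matching coefficient: matches (0-0 and 1-1) over all dimensions."""
    total = a00 + a01 + a10 + a11
    return (a00 + a11) / total if total else 0.0


# Hypothetical pair of regions over 50 dimensions: each region is bounded by
# three dimensions and the two regions have no bounding dimension in common.
score = simple_matching(a00=44, a01=3, a10=3, a11=0)
print(f"{score:.2f}")
```

Although the two regions share no bounding dimension, the 44 dimensions that bound neither region make the pair appear closely related under this coefficient.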

The second coefficient considered was the Jaccard coefficient, J(m,n),
given in Equation 3.2 (Jain and Dubes, 1988, p. 17). The $a_{ij}$ for this coefficient have
the same meaning as those for the simple matching coefficient. The numerator of
this coefficient is the number of dimensions that are Composite for both failure
regions. The denominator is the number of dimensions that are Composite for at
least one of the regions. This coefficient places a heavier emphasis on dimension
participation than on nonparticipation. The Jaccard coefficient was the one used in this
study.

$$J(m,n) = \frac{a_{11}}{a_{11} + a_{01} + a_{10}}$$ (Eq 3.2)

The Jaccard coefficient had to be modified to analyze the Identical and
Coincidental data individually. The reason is that for the numerator, the condition
to be satisfied is not just Composite but specifically Identical or Coincidental. In
other words, for the graph of Identical values, the numerator of the coefficient is
only the number of dimensions that are Identical between the two regions, as is
shown in Equation 3.3. The Coincidental data are treated similarly in Equation 3.4.


$$I(m,n) = \frac{i_{11}}{a_{11} + a_{01} + a_{10}}$$ (Eq 3.3)

where:

I(m,n) - modified Jaccard coefficient for Identical dimensions
$i_{11}$ - number of dimensions that are Identical in regions m and n

$$C(m,n) = \frac{c_{11}}{a_{11} + a_{01} + a_{10}}$$ (Eq 3.4)

where:

C(m,n) - modified Jaccard coefficient for Coincidental dimensions
$c_{11}$ - number of dimensions that are Coincidental in regions m and n
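
The three coefficients can be compared in a brief sketch. It assumes the counts are already available: `a11` dimensions Composite for both regions, `a01` and `a10` dimensions Composite for only one region, and `shared` standing for either the Identical or the Coincidental count. The function names and the sample counts are hypothetical.

```python
def jaccard(a01, a10, a11):
    """Jaccard coefficient: shared Composite dimensions over all Composite ones."""
    denominator = a11 + a01 + a10
    return a11 / denominator if denominator else 0.0


def modified_jaccard(shared, a01, a10, a11):
    """Modified coefficient: `shared` is the Identical or Coincidental count."""
    if not 0 <= shared <= a11:
        raise ValueError("shared count must lie between 0 and a11")
    denominator = a11 + a01 + a10
    return shared / denominator if denominator else 0.0


# Hypothetical pair: four dimensions bound both regions (one Identical,
# three Coincidental) and two more bound each region alone.
a01, a10, a11 = 2, 2, 4
identical = modified_jaccard(1, a01, a10, a11)
coincidental = modified_jaccard(3, a01, a10, a11)
print(identical, coincidental, jaccard(a01, a10, a11))
```

Since every dimension that bounds both regions is either Identical or Coincidental, the two modified coefficients for a pair add up to its Composite (unmodified) Jaccard coefficient.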
D. CLUSTER ANALYSIS


1. Clustering Method 

Two methods were considered for use in identifying failure region clusters.
The first method was to look for k-connected subgraphs (Godehardt, 1988). This
method requires searching for all possible paths between every pair of nodes in the
graph. This is an NP-complete problem. Additionally, in a weighted graph, the
problem must be solved for each threshold value of interest.

K-connected subgraphs were not used for two reasons. The first was that
the method is too detailed for exploratory analysis. It is more suited to identifying
specific clusters in data where the clustering behavior is already well understood,
i.e., where the range of k can be estimated fairly well. The second reason for not using
this method was its computational complexity. Again, this inhibits exploratory
analysis.

The second method was adapted from Jain and Dubes (Jain and Dubes, 
1988, p. 70). This method, called Single-Link Clustering, is also based in graph 
theory but follows more closely the traditional ideas of cluster analysis. Clusters are 
developed by adding edges to the graph in the order of their relative weights. As 
the weight threshold becomes less restrictive, more edges are added to the graph. 

The addition of a new edge to a graph can have one of two results. The 
first is that the edge may connect two nodes that were already connected by edges 
at more restrictive weight thresholds. Edges such as these have the effect of 
strengthening existing clusters. The other result a new edge may have is to merge 
two components in the graph. If one of these components has multiple nodes, that 
edge has increased the size of a cluster. If both components are singleton nodes, 
the edge has initiated a new cluster. (While a singleton node is technically a one 
element cluster, the discussion here uses cluster to mean a grouping of two or 
more nodes.) 
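
The effect of each added edge can be traced with a union-find structure, as in the sketch below. This is an illustration of the idea rather than the procedure used to produce the results; the function name, the effect labels, and the sample edges are hypothetical. Edges of equal weight are simply processed in their input order.

```python
def single_link_events(nodes, edges):
    """Add edges in descending weight order and label what each one does.

    `edges` is an iterable of (weight, u, v). Returns a list of
    (weight, u, v, effect), where effect is "strengthens", "initiates",
    or "enlarges".
    """
    parent = {v: v for v in nodes}
    size = {v: 1 for v in nodes}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    events = []
    for weight, u, v in sorted(edges, key=lambda e: e[0], reverse=True):
        root_u, root_v = find(u), find(v)
        if root_u == root_v:
            effect = "strengthens"  # endpoints were already connected
        else:
            if size[root_u] == 1 and size[root_v] == 1:
                effect = "initiates"  # two singletons form a new cluster
            else:
                effect = "enlarges"  # at least one side was already a cluster
            if size[root_u] < size[root_v]:
                root_u, root_v = root_v, root_u
            parent[root_v] = root_u  # merge the smaller component
            size[root_u] += size[root_v]
        events.append((weight, u, v, effect))
    return events


# Hypothetical weighted edges among five failure regions.
edges = [(0.9, 1, 2), (0.8, 1, 3), (0.7, 2, 3), (0.6, 4, 5), (0.2, 3, 4)]
for event in single_link_events(range(1, 6), edges):
    print(event)
```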

The modification to the Single-Link method as described in Jain and
Dubes was that the requirement that no two edges have the same weight was
relaxed. This modification was reasonable since the goal of this study was not to
identify specific failure regions in specific clusters; nor was it the goal to identify the
specific order in which failure regions were added to clusters. Rather, the goal was
to identify whether there was even a tendency for failure regions to cluster in a way
that was useful for developing software testing strategies.

One note should be made regarding the use of the Jaccard coefficient in 
conjunction with this cluster analysis method. Most clustering methods assume 
that a smaller edge weight indicates nodes that are more similar to each other, i.e. 
more strongly clustered. This idea comes from the fact that edge weights are 
frequently derived from Euclidean proximities. For edge weights based on the 
Jaccard coefficient, however, the closer the coefficient is to one, the more alike the 
failure regions are. (The range of the coefficient is from zero to unity.) This does 
not invalidate the clustering method; it merely means that edges are added to the 
graph by lowering the threshold rather than by raising it. 


2. Hierarchical Vs. Partitioned Clusters 


The clustering method used in this study produces a hierarchical
clustering rather than a partitioned one. This is the type of clustering that was 
desired since it was not clear that failure regions should necessarily belong to 
exactly one cluster. Indeed, since the goal of this study was to determine if 
knowledge about one failure region can be used to find other failure regions, 
hierarchical clustering is more desirable than partitioned clustering. 

If it is the case that failure regions have a strong hierarchical clustering 
tendency, then at least two different ways of exploiting the clusters are suggested. 
First, the stronger (i.e. more restrictive threshold) clusters may provide a method 
to find the other failure regions within those clusters. Second, the potential exists 
to “bootstrap” from one strong cluster to another under the right conditions. This 
would involve identifying the types of dimension participations that result in the
edges that appear at the less restrictive threshold values.


E. CONCLUSION 


This chapter has detailed the procedures followed in analyzing the data used 
for this study. The known faults in four versions of the same program were used to
develop the failure regions for those faults. The failure regions were analyzed for
Identical and Coincidental dimension behavior. The frequency of these types of
behavior was then used to develop weighted graphs. These weighted graphs provide
a means for evaluating the tendency of failure regions to form clusters. The
analysis of these clusters is the subject of the next chapter.


IV. EMPIRICAL RESULTS


A. INTRODUCTION 

This chapter presents the results of the experiment discussed in the previous 
chapter. Before proceeding, however, some caveats should be noted. First, student
programmers produced the software used for this study. While these students
may have had significant experience in programming, they cannot, in general, be 
classed with professional programmers. Fault populations produced by profession- 
al programmers may vary significantly from those of student programmers. Addi- 
tionally, the programs were all for the same application, namely, a battle simulation. 
Different types of applications may also produce significantly different distributions 
of faults. 

There are also some limitations that result from the experimental design. Only 
one method of quantifying the relationship between two failure regions was stud- 
ied, namely, a modified Jaccard coefficient. Additionally, only threshold graphs 
were used for cluster analysis. The narrow focus of the design may impose an ar- 
tificial structure on the data. Using only one analysis method may also obscure im- 
portant features of the data or highlight insignificant features. 

With these limitations in mind, and realizing that extensibility of the results be- 
yond this one application has yet to be established, the results still provide useful 
insight into how faults are related to each other. The next section describes how 
the data are presented. After that, notable characteristics of the data and the valid- 
ity of these characteristics are discussed. Finally, the results are interpreted with a 
view towards software testing applications. 


B. DATA PRESENTATION 

Dendrograms are the typical method of presenting data for hierarchical cluster
analysis. However, the goal of the cluster analysis in this study was not to identify
specific failure region clusters in specific programs. Rather, the goal was to determine
whether failure regions even have a tendency to form clusters. For this reason,
histograms were used instead of dendrograms. The advantage of histograms
is that they are easier to use in comparing the behavior of several populations.
Dendrograms are more useful for analyzing a single population.

For each program version, two histograms were constructed for each dimen- 
sion type. The first histogram shows how many edges are added to the graph in 
each interval of the Jaccard coefficient. In the second histogram, the column in 
each Jaccard coefficient interval shows how many nodes have their largest inci- 
dent edge in that interval. These histograms are presented in Appendix F. 

The first histogram presents additional information. The total column height in 
each interval shows the number of edges that have weights in that interval. The col- 
umn is divided into two parts. The black part, labeled “Between Newly Connected 
Nodes,” shows the numbers of edges that are incident on nodes that had no inci- 
dent edge in a higher threshold interval. This information corresponds directly to 
the numbers of nodes shown in the second histogram. The gray part, labeled “Be- 
tween Previously Connected Nodes,” shows the numbers of edges that are inci- 
dent on nodes that did have an incident edge in a higher threshold interval. 

The edges were divided into "Between Newly Connected Nodes" and “Be- 
tween Previously Connected Nodes” to help clarify the types of clustering behavior 
that the failure regions were exhibiting. The former category helps determine the 
numbers of edges involved in merging pairs of singleton nodes into new clusters 
or adding singleton nodes to a cluster. The latter category helps determine when 
previously defined clusters are being strengthened or are merging. While "Between 
Previously Connected Nodes" does not distinguish between edges added within a 
cluster and edges added between clusters, this is not important because it would 
not provide additional information about whether failure regions tend to form clus- 
ters, which is the primary goal of the cluster analysis. Although the strength and 
size of clusters would be important in practical software testing applications of fail- 
ure region clusters, the more important questions for this study are: how many 
nodes are in some cluster and are the nodes added to the cluster at a statistically 
significant threshold level?

The abscissae of the histograms are labeled with the Jaccard coefficient decreasing
from left to right. The reason for this convention is that cluster data are
typically presented so that the more significant edges are to the left in the histogram
or dendrogram. This requires the largest value of the Jaccard coefficient to be
presented at the left.

The histograms are divided into intervals of 0.05. In general, the data included 
in an interval are strictly less than the upper limit of the interval and greater than or 
equal to the lower limit. There are two exceptions: data in the uppermost interval 
are less than or equal to unity; data in the lowermost interval are strictly greater 
than zero. The reason for the first exception is obvious. The reason for the second 
exception is that edges of zero weight represent the absence of a relationship between
two failure regions while nonzero edges represent the presence of some relationship,
however weak. Inclusion of the zero weight edges might have skewed
the histograms and led to false conclusions about failure region clustering tendencies.
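
The interval rules and the two histograms can be sketched as follows. The sketch assumes each edge weight is kept as an exact fraction of its numerator and denominator counts, which avoids rounding errors at interval boundaries such as 0.15; the function names and sample edges are hypothetical. It does not reproduce the division of edges into newly and previously connected nodes.

```python
from collections import Counter
from fractions import Fraction

BINS = 20  # intervals of width 0.05 covering weights above 0 up to 1


def interval(weight):
    """Index of the 0.05-wide interval holding `weight` (0 is the lowest).

    Lower limits are inclusive and upper limits exclusive, except that a
    weight of 1 falls in the top interval. Zero weights have no interval.
    """
    if weight <= 0:
        return None
    return min(int(weight * BINS), BINS - 1)


def histograms(edges):
    """Return (edges per interval, nodes per interval of their largest edge).

    `edges` is an iterable of (weight, u, v) with exact Fraction weights.
    """
    edge_hist = Counter()
    best = {}  # node -> largest nonzero incident edge weight
    for weight, u, v in edges:
        k = interval(weight)
        if k is None:
            continue  # zero weight means no relationship, so it is not counted
        edge_hist[k] += 1
        for node in (u, v):
            best[node] = max(best.get(node, weight), weight)
    node_hist = Counter(interval(w) for w in best.values())
    return edge_hist, node_hist


# Hypothetical edges among four failure regions.
edges = [
    (Fraction(1, 1), 1, 2),
    (Fraction(3, 20), 2, 3),
    (Fraction(0), 3, 4),
    (Fraction(1, 3), 1, 3),
]
edge_hist, node_hist = histograms(edges)
print(sorted(edge_hist.items(), reverse=True))
print(sorted(node_hist.items(), reverse=True))
```

Sorting the intervals in descending order matches the convention of presenting the largest coefficient values at the left.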

In several cases, two or more distinct faults shared identical failure regions. 
When this occurred, the failure region was considered only once in constructing the 
histograms and the graphs. The reason is that if faults share identical failure re- 
gions, any test that reveals one of the faults will reveal all of them. The goal of this 
study is to find a method to reveal new failure regions rather than redundant ones. 


C. DATA ANALYSIS 


1. Notable Characteristics 

Analysis of the histograms suggests that there is indeed a tendency for
failure regions to form clusters. For the Identical dimensions, all four versions'
histograms exhibit small groups at relatively large thresholds. These groups
correspond to several small and unconnected clusters being formed. Over half of
the nodes in the graphs have at least one incident edge in these higher threshold
intervals, so the edges are spread across many nodes rather than concentrated
between just a few.

The behaviors of the Coincidental and Composite dimensions are broadly
similar to that of the Identical dimensions. However, there appears to be a 
difference in how the clusters grow. (This is discussed further in Section D.) 
Additionally, there seems to be more variation in the behavior between the versions 
for Coincidental dimensions as opposed to Identical dimensions. It is difficult to 
judge whether there are, in fact, significant differences here since there are only 
four versions to compare. 


2. Data Validation 

In order to verify that the noted characteristics could not be attributed to 
random behavior or to the experimental method, the experimental data were 
compared with a random population of regions. The null hypothesis to be tested by 
this comparison was: there is no difference in behavior between the experimental 
population of failure regions and a population of regions bounded by arbitrarily 
selected conditions occurring in the source code. Rejection of the null hypothesis
indicates that clustering is a behavior of the faults rather than of the application studied
or the analysis technique employed.

Failure regions are bounded by conditions that either arise directly from 
the program source code or are synthesized from the source code. The random 
regions were bounded by conditions that were randomly extracted from the Gold 
version of the program in Shimeall and Leveson’s study (Shimeall and Leveson, 
1991). The Gold version was used to ensure that the random regions were not 
biased in favor of one of the test versions. The conditions were selected from a text 
file using the UNIX library function random. The distribution of the numbers of 
conditions in the random regions was selected to reflect the number of dimensions 
in the experimental failure regions. 
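
The construction of the random regions can be sketched as below. The sketch assumes that conditions are drawn without repetition within a region, which the procedure above does not state, and that the region sizes are supplied by the caller to mirror the experimental distribution. The function name, the condition strings, and the sizes are hypothetical, and Python's `random` module stands in for the UNIX library function.

```python
import random


def random_regions(conditions, sizes, seed=None):
    """Build one random region per entry in `sizes`.

    `conditions` is the pool of conditions extracted from the source text;
    `sizes` gives how many conditions each region should receive.
    """
    rng = random.Random(seed)  # a private generator keeps runs repeatable
    return [rng.sample(conditions, k) for k in sizes]


# Hypothetical pool of conditions and region sizes.
pool = ["a > 0", "b == 1", "c <= limit", "d != 0", "e < f"]
regions = random_regions(pool, sizes=[2, 3, 1], seed=1)
for region in regions:
    print(region)
```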

Two populations of random regions were used in order to match the sizes 
of the experimental populations. A 20 region set was used to approximate Versions 
1, 2, and 4; a 40 region set was used to approximate Version 3. The 20 region set 
was a subset of the 40 region set. These sets are referred to as R20 and R40, 
respectively. A statistical profile of the random regions is given in Table 4.1. The
mean number of input variables in the random regions is smaller than for the
experimental regions since the random regions contain no predicates. If the
predicates in the experimental regions are not expanded, the mean number of
dimensions of the experimental regions is similar to the mean number of variables
in the random regions.


TABLE 4.1: PROFILE OF RANDOM FAILURE REGIONS

Minimum/Maximum variables in a region
Total input variables
Mean input variables per region
Std. dev. input variables per region
Mean input conditions per region
Std. dev. input conditions per region





The random regions were treated with the same experimental procedure as 
the experimental regions. One-way analysis of variance (ANOVA) was applied to
the four experimental versions and the two random versions for both the edge and
node distributions and for each type of dimension. The actual edge weights (as opposed
to the histogram distributions) were used in this analysis. Computations
were performed with the UNIX/STAT data analysis program oneway (Perlman,
1986). The results of the analysis are presented in Tables 4.2 through 4.7. The column
headings are self-explanatory except for the last two; P(R20) and P(R40) are
the probabilities that the given experimental distribution is the same as the random
distribution. These probabilities are based on a Student's t test.
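
For readers unfamiliar with the computation, the F statistic of a one-way ANOVA can be sketched in a few lines. This is not the oneway program cited above; it is a minimal illustration that assumes each group has at least one value and that the within-group variation is not zero. The function name and the sample groups are hypothetical, and converting F to a probability would additionally require the F distribution.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of numeric samples."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means)
    )
    # Variation of the observations around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical edge weights for three populations.
f_stat, df_b, df_w = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(round(f_stat, 3), df_b, df_w)
```

A large F value indicates that the differences between the population means are large relative to the spread within each population.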

The results of ANOVA indicate that the null hypothesis can be rejected. The
experimental edge distributions differ from the random edge distributions at better
than a 99 percent confidence level with the exception of Version 4’s Identical dimensions,
which have a better than 90 percent confidence level. Given the exploratory
nature of this study and the lack of prior information about the experimental
population, a 90 percent confidence level is generally considered acceptable.
Thus, if P(R) > 0.1, the experimental data were not considered significantly different
from the random regions.

TABLE 4.2: IDENTICAL DIMENSION EDGE STATISTICS
TABLE 4.3: COINCIDENTAL DIMENSION EDGE STATISTICS
TABLE 4.5: IDENTICAL DIMENSION NODE STATISTICS
TABLE 4.6: COINCIDENTAL DIMENSION NODE STATISTICS
TABLE 4.7: COMPOSITE DIMENSION NODE STATISTICS

The node distributions had a wider range of variation, but most of the experi- 
mental distributions differed from the random distributions with greater than 90 per- 
cent confidence. There were some notable exceptions. For the Identical dimen- 
sion, Version 2 and Version 4 were not significantly different from R40. For the Co- 
incidental dimension, Version 2 did not vary significantly from R40. Finally, for the 
Composite dimension, Version 2 did not vary significantly from R40. 

While Version 3 is clearly different from the random distributions in all cases, 
there seems to be a contrast between the node distributions of Versions 1 and 2 
and Version 4. Version 4 was significantly weaker than either Version 1 or 2 for the 
Identical dimension while it was significantly stronger for the Coincidental
dimension. There is insufficient data to determine whether more random behavior in one
dimension leads to less random behavior in another dimension.


3. Cluster Formation 

The general shapes of the experimental data histograms are slightly but 
significantly different from the random data histograms. Specifically, the small 
groups of edges and nodes at higher coefficient thresholds are absent in the 
random distributions. However, the experimental distributions appear to be 
overtaken by random behavior below thresholds of about 0.1 to 0.3, depending on 
the dimension type. 

The primary usefulness of the histograms has been twofold. First, they 
have established that there is a statistically significant tendency for failure regions 
to form clusters. Second, they have provided an indication of which edges in the 
graph are, in fact, statistically significant. The shortcoming of the histograms is that
they do not show exactly how clusters are being formed. The graphs must actually
be constructed for this purpose. 

Graphs were constructed using only statistically significant edges. 
Examples of these graphs are given in Figures 4.1 and 4.2. Most of the graphs are 
too large to be presented in a graphical format. Complete listings of the edges are 
presented in Appendix E. The numbering of the nodes is derived from the order in 
which the faults were discovered in Shimeall and Leveson’s study (Shimeall and 
Leveson, 1991). A complete listing of the numbered faults and their associated 
failure regions is given in the library of failure regions (Shimeall, 1991). 

The Identical dimensions of Versions 1 and 2 displayed behavior similar 
to that shown in Figure 4.1 for Version 3. Clusters (subgraphs) of two to nine nodes 
formed with many of the clusters containing components that were complete on 
three to six nodes. Version 4, on the other hand, had only three two-node clusters 
above the random level. It is notable that Version 4 displayed the least overall 
variance from the random regions for the Identical dimension. 

The graphs for the Coincidental dimension tended to be formed in a 
different fashion. As the example in Figure 4.2 shows, there are several subgraphs 
that are complete on three or four nodes. However, the clusters are not as clearly 
separated in the Coincidental dimension as they are in the Identical dimension. 

Graphs for the Composite dimension were not constructed since the
original purpose of this dimension was simply to ensure that division into Identical
and Coincidental did not impose an artificial structure on the data. Review of the 
table of edges in Appendix E suggests that graphs constructed from the Composite 
dimension would behave similarly to those of the Coincidental dimension. 

The differences in the way the clusters join as the threshold is relaxed
suggest that the hierarchical clustering theory may have been appropriate for the
Coincidental dimension but that the Identical clustering might, in fact, be better 
modeled as partitioned. 
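
The construction of threshold graphs and the way clusters merge as the threshold is relaxed can be sketched as follows. This is an illustrative reconstruction, not the tooling used in the study; the function name, the node numbers, and the edge weights are all hypothetical.

```python
from collections import defaultdict


def threshold_clusters(weights, threshold):
    """Return the connected components of the threshold graph.

    `weights` maps an unordered node pair (u, v) to its coefficient.
    Only edges with weight >= threshold are kept; nodes left without
    any edge appear as singleton clusters.
    """
    adjacency = defaultdict(set)
    nodes = set()
    for (u, v), weight in weights.items():
        nodes.update((u, v))
        if weight >= threshold:
            adjacency[u].add(v)
            adjacency[v].add(u)

    seen = set()
    clusters = []
    for start in sorted(nodes):
        if start in seen:
            continue
        # Depth-first search collects everything reachable from `start`.
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        clusters.append(sorted(component))
    return clusters


# Hypothetical coefficients between failure regions 1..6.
example_weights = {(1, 2): 0.8, (2, 3): 0.5, (3, 4): 0.2, (5, 6): 0.9}
for level in (0.6, 0.4, 0.1):
    print(level, threshold_clusters(example_weights, level))
```

Lowering the threshold in this sketch makes the cluster {1, 2} absorb nodes 3 and then 4 while {5, 6} stays isolated, which mirrors the distinction drawn above between clusters that merge (hierarchical behavior) and clusters that remain separate (partitioned behavior).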

[Figure 4.2 annotation: There are thirteen nodes in the graph that remain unconnected to any other node. These are not represented here.]

Figure 4.2: Version 1 Coincidental Clusters (coefficient in classes > 0.25)



D. DATA INTERPRETATION


Two different clustering behaviors have been noted for the experimental data. 
The difference in behavior seems to be driven primarily by dimension type. This 
suggests that failure region clusters may support two different methods of software 
testing. 

The first type of clustering is essentially partitioned, as displayed by the Iden- 
tical dimensions. This type of clustering would support a testing method that first 
broadly examines the software, for example all-branches structural testing. Failure 
regions of faults found by the initial method can then be used to search for other 
faults in the cluster. Since for partitioned clusters every fault is in exactly one clus- 
ter, it would be necessary to find a set of faults that covers several clusters with the 
initial testing method in order for failure region analysis to be a successful follow- 
on approach. 

The hierarchical clustering exhibited by the Coincidental dimension is more 
suggestive of an iterative approach to testing. At least one fault must still be found 
by some other method but it may then be possible to iteratively analyze failure re- 
gions and find more faults. 

The information available to the tester from failure region analysis is more spe- 
cific than just which variables should be considered in constructing a test set, es- 
pecially for the Identical dimension. Failure region analysis gives the tester the spe- 
cific conditions that resulted in faults. This study has shown that he may reasonably 
expect these same conditions to appear in the failure region bounds of other faults. 

Finally, it should be noted that failure region cluster analysis cannot guarantee 
that every fault will be located. The primary reason for this is that not every fault will 
be in a cluster or, more correctly, that some faults may be in singleton clusters. 
Several of the graphs generated in this study contained singletons. These faults 
will have to be discovered by some other method. 


E. CONCLUSION 


This chapter has presented the results of this experiment and their validity. It 
has also suggested ways these results may be applied in developing software testing
strategies. The last chapter summarizes the findings of this thesis and discusses
how these findings support or contrast with previous findings. Directions for further
research are also suggested there.


V. CONCLUSIONS AND SUGGESTIONS FOR FURTHER
RESEARCH


Previous authors have postulated that faults are related to each other and 
testers have tried to exploit the effect. However, the evidence and applications 
have been largely anecdotal. This thesis is the first work that has empirically 
analyzed the relationships between specific faults by using failure regions. The 
results of this thesis not only support the existence of such relationships but also
suggest methods for explicit rather than just implicit exploitation of them. This
chapter summarizes these results, discusses them in the light of previous work,
and describes directions for future research that are suggested by this thesis. 


A. CONCLUSIONS 


This thesis offers strong evidence that failure regions tend to form clusters. 
The usefulness of this clustering behavior is that known faults in a program can be 
analyzed to produce their failure regions. Those failure regions then provide infor- 
mation about variables and conditions that are likely to be involved in other failure 
regions. This, in turn, suggests to the tester areas that will probably be fruitful in the
search for other faults.

Failure region clustering was observed based on two distinct criteria: shared 
bounding conditions (the Identical dimension) and shared variables that appear in 
different contexts (the Coincidental dimension). The nature of the cluster formation 
for the two dimensions, however, was markedly different. The Identical dimension 
tended to produce small, isolated, strongly connected clusters. The nodes in these 
clusters were defined at relatively high thresholds and then the clusters became 
more strongly connected as the edge weight threshold was lowered. On the other 
hand, the Coincidental dimension tended to form larger, less strongly connected 
clusters. The clusters that formed at higher thresholds tended to merge into one or 
two larger clusters as the edge weight threshold was relaxed. There was no strong- 
ly identifiable pattern to the Coincidental cluster formation. 


B. RELATIONSHIP OF RESULTS TO PREVIOUS WORK


The results of this thesis generally support the findings of previous researchers 
in the area of relationships between faults. This agreement gives some confidence 
that these results may extend to more general software applications. The results 
also offer some amplification to previous studies. 

Offutt offered convincing empirical evidence that the coupling effect existed, 
but his study provided no explanation of the basis for it (Offutt, 1989). This thesis 
makes the first step toward identifying the specific behaviors of faults that result in 
this effect. The Identical dimensions considered in this study closely resemble the 
idea behind mutation testing: change one condition at a time and run a new test. 
Thus, the small, strongly connected, isolated clusters that were formed in the Iden- 
tical dimension graphs provide an explanation of why multiple-mutation testing 
does not fare significantly better than single-mutation testing. Every fault in the
cluster has a short path to most other faults in the cluster.

The Identical dimension results also seem to support Hamlet and Taylor's 
analysis that partitioned testing is a good debugging method but a poor technique 
for release testing (Hamlet and Taylor, 1988). If one fault in a cluster is known, its 
failure region may allow the partitions to be refined in a way that leads to the other 
faults in the cluster. However, this does not aid in finding faults in other clusters 
and, in general, failure region analysis offers no confidence that every cluster of 
faults has been located. 

The graphs formed in the Coincidental dimension are more eccentric and sug- 
gest a complex behavior that is more difficult to analyze than the Identical dimen- 
sion graphs. The absence of a clearly evolving structure in the Coincidental graph 
formation may offer insight into when specific testing techniques are appropriate. 
These graphs would seem to provide a basis for Hamlet and Taylor's assertion that 
in the absence of specific information that allows technique refinement (such as 
that provided by the Identical dimension), random testing is as reliable as the best 
planned testing (Hamlet and Taylor, 1988). Further study of the Coincidental di- 
mension is needed to clarify its implications for software testing. 


C. SUGGESTIONS FOR FURTHER RESEARCH


1. Experimental Method 


While the results of this work are promising, the experimental population 
was small and narrowly focused. Additionally, the programs were written by 
students. Both the method and the results should be validated using a broad range 
of professionally produced applications. 

One weakness of the method used in this thesis is in how it deals with one 
failure region that is a subset of another. For instance, if region 1 is bounded by 
conditions B and C and region 2 is bounded by A, B, C, D, E and F, the Jaccard 
coefficient is 0.33. However, the relationship is probably stronger than is suggested 
by the coefficient. 
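
The subset example above can be checked with a short sketch. The Jaccard coefficient is the measure named in the text; the overlap coefficient shown beside it is only one possible alternative that rates a subset relationship more strongly, and it is offered here as an assumption for illustration, not as a measure used in this thesis.

```python
def jaccard(a, b):
    """Jaccard coefficient: shared conditions divided by all conditions."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def overlap(a, b):
    """Overlap coefficient: shared conditions divided by the smaller set.

    Equals 1.0 whenever one region's conditions are a subset of the other's.
    """
    a, b = set(a), set(b)
    smaller = min(len(a), len(b))
    return len(a & b) / smaller if smaller else 0.0


region_1 = {"B", "C"}
region_2 = {"A", "B", "C", "D", "E", "F"}

print(round(jaccard(region_1, region_2), 2))  # 0.33
print(overlap(region_1, region_2))            # 1.0
```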

Another weakness is inherent in the use of a coefficient for weighting the 
edges of the graphs. The relationships described by the coefficient are actually 
rational rather than real. The ratios 1/2 and 6/12 both yield the same coefficient. 
However, the second ratio probably represents a more involved (and perhaps 
more easily exploitable) relationship. Both of these difficulties with the coefficient 
suggest the need for a more descriptive representation of the relationship between 
failure regions. The separate distributions of the numerators and the denominators 
were used in an initial attempt to exploit this difference in ratios. However, this 
approach offered no insight and was omitted from this thesis. 
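
The loss of information in a reduced ratio can be made concrete with a small sketch. The `RegionOverlap` type and its field names are hypothetical; the point is only that keeping the numerator and denominator as separate counts preserves a distinction that the coefficient alone discards.

```python
from fractions import Fraction
from typing import NamedTuple


class RegionOverlap(NamedTuple):
    """Shared and total condition counts for a pair of failure regions."""

    shared: int  # number of conditions common to both regions
    total: int   # number of conditions in either region (must be > 0)

    @property
    def coefficient(self) -> Fraction:
        # Fraction reduces 6/12 to 1/2, which is exactly the loss described.
        return Fraction(self.shared, self.total)


small = RegionOverlap(shared=1, total=2)
large = RegionOverlap(shared=6, total=12)

print(small.coefficient == large.coefficient)  # True: same edge weight
print(small == large)                          # False: counts still differ
```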

Finally, only threshold graphs were used for cluster analysis. Alternative 
approaches to the problem were described in Chapters II and III. These and other
methods should be explored. Analysis for k-connected components seems 
particularly promising in light of the graphs presented in the previous chapter. 


2. Related Questions 
Several questions about fault and failure region behavior have arisen from 
this study. First, several failure regions were identical in one condition, e.g., the 
reachability condition, but differed in the other two conditions. Can the method 
developed in this study (or some other method) take advantage of this special 
behavior? What if Condition I for one failure region is identical to Condition II or III
for other failure regions?

Somewhat different behaviors were noted among the three versions that 
had approximately the same number of failure regions. Is there an antagonistic 
effect between the Coincidental and Identical dimensions? Does strong clustering
in one dimension mean weak clustering in the other? 

Faults were numbered in order as they were discovered by the various 
testing techniques of Shimeall and Leveson's study (Shimeall and Leveson, 1991). 
Thus, many sequentially numbered faults were discovered by the same fault- 
detection method. Many sequentially numbered faults were also strongly 
connected in the graphs, often at the same threshold values. Is there an identifiable 
relationship between certain fault-detection techniques and certain types of fault 
clusters? 

An in-depth study of the clusters identified in this study may be useful in 
determining specifically which types of conditions and variables are most likely to 
cause clusters to form. This is an area where comparison of different software 
applications is especially important. Even if clustering is a characteristic of failure 
regions in general, the specific types of conditions that cause the clustering may 
vary from application to application. 

Finally, the understanding of relationships between faults that this thesis 
offers may provide insight into refining existing fault-detection techniques. For 
example, all-paths testing is generally considered to be a desirable goal; however, 
it is usually not achievable because the number of paths is too large. The key 
relationships identified by failure region analysis may provide the information 
necessary to be able to modify such techniques so that they are practical. 


D. APPLICATIONS BEYOND TESTING 


This study has focused on the use of failure regions to understand the relation- 
ship of one fault to another. The goal has been to develop information that will be 
useful in software testing. In a broader sense, however, the relationship between 
two failure regions is a condensation of the relationships between the two sets of 
code locations that are associated with those failure regions. A set of bounding
conditions that is analogous to a failure region can be developed for any location 
in a program. If the conclusions of this thesis are applied from this perspective, it 
may lead to a better understanding of how different parts of a program interact with 
each other. Such an understanding might help prevent occurrences like the tele- 
phone example cited in the first chapter by allowing failure prediction. 


APPENDIX A


GRAPH THEORY DEFINITIONS 


The following definitions are taken from Buckley and Harary (Buckley 
and Harary, 1990): 


1. A graph consists of a finite nonempty set N of nodes together with a
   set E of edges. An edge is an unordered pair of distinct nodes in N.

2. A path from node u to node v is a sequence of distinct nodes and edges
   that starts with u and ends with v. The length of a path is equal to
   the number of edges in the path.

3. The distance between nodes u and v is equal to the length of a shortest
   u-v path.

4. A graph is connected if there is a path joining each pair of nodes.

5. A component of a graph is a maximal connected subgraph.

6. A cutnode (bridge) of a graph is a node (edge) whose removal increases
   the number of components.

7. A nonseparable graph is connected, nontrivial, and has no cutnodes.

8. A block of a graph is a maximal nonseparable subgraph.

9. The eccentricity of a node v is the distance to a node farthest from v.

10. The radius (diameter) of a graph is the minimum (maximum) eccentricity
    of all nodes in the graph.

11. The center of a graph is the set of all nodes whose eccentricity equals
    the radius of the graph.

12. The connectivity (edge-connectivity) of a graph is the minimum
    number of nodes (edges) whose removal results in a disconnected or
    trivial graph.

13. A graph is n-connected (n-edge connected) if its connectivity
    (edge-connectivity) is at least n.
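
As an illustration of the distance, eccentricity, radius, diameter, and center definitions above, the following sketch computes them by breadth-first search on a small example. The function names and the adjacency-list representation are assumptions made for this example; the graph is assumed to be connected.

```python
from collections import deque


def distances_from(graph, source):
    """Breadth-first search: length of a shortest path from source to each node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def eccentricities(graph):
    """Eccentricity of every node; raises if the graph is not connected."""
    ecc = {}
    for node in graph:
        dist = distances_from(graph, node)
        if len(dist) != len(graph):
            raise ValueError("graph is not connected")
        ecc[node] = max(dist.values())
    return ecc


def radius_diameter_center(graph):
    """Return (radius, diameter, sorted list of center nodes)."""
    ecc = eccentricities(graph)
    radius = min(ecc.values())
    diameter = max(ecc.values())
    center = sorted(n for n, e in ecc.items() if e == radius)
    return radius, diameter, center


# A path on five nodes: 1 - 2 - 3 - 4 - 5 (adjacency lists).
path_graph = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
print(radius_diameter_center(path_graph))  # (2, 4, [3])
```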


APPENDIX B


VERSION 1 FAILURE REGIONS 


Notation 


In the descriptions that follow, the following conventions are used: 


• So far as is possible, the conventions of the specification have been preserved.

• Text appearing in italics (e.g. 'Endurance') is defined within the scope of this document,
  either globally or for a specific failure region.

• Text appearing in roman type (e.g. 'Army[].Endurance') denotes program variables for the
  implementations containing the fault. The only exception to this is the variable 'Mainloop',
  which is used to indicate the current simulation cycle, but may not appear in a specific
  version under that name.

• Because program variables are more than one character in length, all multiplication
  is shown explicitly with the multiplication symbol ×.

• Due to the length of the formulae below, it is necessary to break formulae across more
  than one line. There are no matrix or vector operations appearing in this document, and
  parentheses are used strictly to delimit portions of formulae to improve readability or to
  indicate precedence of operations.

• All definitions within 'Condition I' of a failure region are assumed to extend over 'Condition
  II' and 'Condition III' of that failure region unless use of parentheses indicates otherwise.
  All definitions within 'Condition II' of a failure region are similarly assumed to extend over
  'Condition III' of that failure region.

• The diacritical marks ^ and " are used strictly to distinguish between variables of similar
  name and role in a given failure region.


Predicate Definitions

Endurance of Squadron (B, g, j) at time t:

\[
\mathit{Endurance}(B,g,j,t) = \mathrm{Army}[B,g].\mathrm{Endurance}[j] - \mathrm{Army}[B,g].\mathrm{Wear}[j] \times t - \mathit{Damage}(B,g,j,t-1) + \mathit{Repair}(B,g,j,t-1)
\]

Weapon Damage of Squadron (B, g, j) up to and including time t:


0 | 
Damage(B,g,j,t ra D 


otherwise 
МАгту|- В) C NumW Types / Army[^B,e)]. Weapon[w].NumWeapon 


2 


еші 


Une 
Damage( Bg t Army[-B, e]. Weapon[w].Damagex 
Army[-B, g]. WeapSensativity[w] x 


(2B,9,3(t — 1) — az B,e,u,i(t — 1))?+ 


кі 





(ув,о (4 - 1) - ауюв сані С ШЕ 


шала Шаһ Army[^B,e]. Weapon|w|. Radius 


\ 
Whether or not Squadron (B, g, j) is a casualty at time t:


Casualty(B,g,j,t) = (Endurance(B,g,j,t —1) > O)A 
(стро < 05) 
Repair applied to Squadron(B,g,j) up to and including time t: 
0 Ше 


Repair(B,g,j,t — 1) it > а 
-Casualty(B, g, j,t — 1) 
Repair(B,g,j,t 1)+ otherwise 
КЕ ЕМ D RE 
FirRate(B,g,t — 1)/NumCas(B, g,t), 
(Army [B, g].Endurance[;] 
— Endurance(B,g, j,t — 1) 
- Кераїт( В, 9,3, - 1) 
+Repair(B,g,j,t — 2))) 


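Number of Casualties in Battalion B, g at time t:

\[
\mathit{NumCas}(B,g,t) = \sum_{j=1}^{\mathrm{Army}[B,g].\mathrm{Squadrons}}
\begin{cases}
1 & \text{if } \mathit{Casualty}(B,g,j,t-1) \\
0 & \text{otherwise}
\end{cases}
\]

Rate of Repair available to any squadron of battalion B, g at time t:

\[
\mathit{FixRate}(B,g,t) = \mathrm{Army}[B,g].\mathrm{FixRate} \times \mathit{NumFix}(B,g,t-1)
\]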


Number of Squadrons in battalion B, g dedicated to repair other squadrons at time t: 


Army[B,g].Squadrons ( 0 if ~Casualty(B, g,j,t) 


| | ТЕЗ 1 otherwise 
NumFiz(B,g,t) 2 Army[B, g].NumFixers x Army[B, g].Squadrons 
Amount of supplies available in battalion B, g at time t:

\[
\mathit{Suppl}(B,g,t) = \mathrm{Army}[B,g].\mathrm{FixSuppl} - \sum_{j=1}^{\mathrm{Army}[B,g].\mathrm{Squadrons}} \mathit{Repair}(B,g,j,t-1)
\]

X Location of Battalion B, g at time t:

\[
x_{B,g}(t) = \mathrm{Army}[B,g].\mathrm{X} + \sum_{d=1}^{t} \Big( V(B,g,d) \times \cos(\mathrm{Army}[B,g].\mathrm{Theta})
\times \mathit{TM}\big(B,g,x_{B,g}(d-1),y_{B,g}(d-1),V(B,g,d-1)\big)
\times \mathit{WM}\big(B,g,x_{B,g}(d-1),y_{B,g}(d-1),d\big) \Big)
\]

Y Location of Battalion B, g at time t:

\[
y_{B,g}(t) = \mathrm{Army}[B,g].\mathrm{Y} + \sum_{d=1}^{t} \Big( V(B,g,d) \times \sin(\mathrm{Army}[B,g].\mathrm{Theta})
\times \mathit{TM}\big(B,g,x_{B,g}(d-1),y_{B,g}(d-1),V(B,g,d-1)\big)
\times \mathit{WM}\big(B,g,x_{B,g}(d-1),y_{B,g}(d-1),d\big) \Big)
\]


Velocity of Battalion B, g at time t: 


Агту|В ,g).Squadrons | œo if Endurance(B,g,j,t—1)<0 


(B. g ti Р ius 
(В, 0,1) пыр ней х ЕЯ otherwise 
Terrain effect on Movement of Battalion B, g at location x, y moving at velocity v:
Let x' and y' represent the end of the possible movement, p,q be the Terrain grid location of x, y:

\[
x' = x + v \times \cos(\mathrm{Army}[B,g].\mathrm{Theta})
\]
\[
y' = y + v \times \sin(\mathrm{Army}[B,g].\mathrm{Theta})
\]


D od 


DLE aed 
А ТЕ 


ТМ(В, 4, І, У, о) = Army[B,g].MaxSlope— Attire 2.4(у г! » E x! y! )  Att(p(z).q(y).x, v) 
тах |0, 


х'—1)24+(у'— у)2 + 
0 Army|[B,g].MaxSlope | otherwise 


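Weather effect on Movement of Battalion B, g at location x, y at time t:
Let (WX_i, WY_i) be the center location of storm i at time t:

\[
WX_i = \begin{cases}
\mathrm{Weather}[i].\mathrm{WX0} & \text{if } t < \mathrm{Weather}[i].\mathrm{TStart} \lor t > \mathrm{Weather}[i].\mathrm{TEnd} \\
\mathrm{Weather}[i].\mathrm{WX0} + (t - \mathrm{Weather}[i].\mathrm{TStart}) \times \mathrm{Weather}[i].\mathrm{dWX} & \text{otherwise}
\end{cases}
\]

\[
WY_i = \begin{cases}
\mathrm{Weather}[i].\mathrm{WY0} & \text{if } t < \mathrm{Weather}[i].\mathrm{TStart} \lor t > \mathrm{Weather}[i].\mathrm{TEnd} \\
\mathrm{Weather}[i].\mathrm{WY0} + (t - \mathrm{Weather}[i].\mathrm{TStart}) \times \mathrm{Weather}[i].\mathrm{dWY} & \text{otherwise}
\end{cases}
\]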


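Let W be the total effect of storms on location (x, y) at time t:

\[
W(x,y,t) = \sum_{i=1}^{\mathrm{Params.NumWEvents}}
\begin{cases}
0 & \text{if } t < \mathrm{Weather}[i].\mathrm{TStart} \lor t > \mathrm{Weather}[i].\mathrm{TEnd} \\
\max\left(0, \dfrac{\mathrm{Weather}[i].\mathrm{WRadius} - \sqrt{(x - WX_i)^2 + (y - WY_i)^2}}{\mathrm{Weather}[i].\mathrm{WRadius}}\right) \times \mathrm{Weather}[i].\mathrm{WSeverity} & \text{otherwise}
\end{cases}
\]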


1 iV (o, v. bU 


WM(B,g,z,y,t) 2 4 Army[B, g].MWEffect x 
И (су t )— Params. WMaxSeverity x Params.NumW Events 


Params. W MaxSeverity x Params. NumW Events otherwise 
Weather effect on Observation at location (x, y) at time t:
0 Ш 0 
WO. t,t) = 


Params. W MaxSeverity x Params.NumW Events 


| W (z,y,t)—Params.W MaxSeverity x Params. NumWEvents otherwise 


(X,Y) Location of Squadron B, g, j at time t: 
Let s be the number of Squadrons in Battalion B, g prior to squadron j that have positive 
endurance at time t: 


1-1 : ; 
TC 0 if Endurance(B,g,i,t-—1) <0 
5(В,9,),1) = > | otherwise 
тв ‹({ — 1) + Агту[В, 9].5аца45ерх 


(=(В, 0,2,0) = ет | X Army[B, g].GRow) = 


Army[B,g]. GRow x Army[B,g].SquadSep 
2 
if s(B,g, Army[B, g].Squadron + 1,t) — s(B,9,j,t) > 
Атту| В, 4).СВом 


78, (2) = TB (t — 1) + Army[B, g].SquadSep x 


(5(В, 9, 2,0 - cem x Army[B, g].GRow) - 


s(B,g,Army[B,g].Squadron +1,t)—- |: = au eae |x Army[B,g]-GRow 


x Army[B, g].SquadSep 
otherwise 


В,4,), 
YB,g,j (t) =yB (t — 1) + Army[B, g].RowSep x Eid 


s( B,g, Armyl| B,g|.Squadron -1,t 
—0.5 x D X Army[B, g].RowSep 
Squadron B, g, j observes squadron ¬B, e, k at time t:

\[
\mathit{Observe}(B,g,j,e,k,t) = \mathit{BigEnough}(B,g,j,e,k,t) \land \mathit{Clear}(B,g,j,e,k,t) \land \mathit{Obvious}(B,g,j,e,k,t)
\]

Squadron ¬B, e, k is large enough to be seen at the distance from squadron B, g, j at time t:


BigEnough(B,g, 3, e, k,t) z 
тд] = тв, (Ї -1)А у97 < ува(Е- ША 
тек = тв (= 1) увек = изне (ена 
пах Гап". => — јап ілі 
(z', y), (z", y^) € ((zek x Army[5B, eJ.5quadWidth/2, 
yek X Army[- B, e]. SquadLength/2)) 


2 Army[B, g].ObsMinAngle[j]) 
No terrain blocks the view of squadron ~B, e, k from the position of squadron B, g, j at time t: 
Clear(B,g,j,e,k,t)= 
19] = 2B 93(t — 1) ЛудЈ = ува @ ША 
tek = rape — 1) Ayek = Yap ene — 1)A 
= g мша =" | жентек мше 
(Уа, а", E с, EN г, em rz е ла = Багш АТОО 


= | на __ |. | UCM 
c= [рес р. = === ^ 


JA 


z = А (а саол чо) МЕ Ае Е YERA 
(Уп,1 < п < Params.SampleRate — 1, 
в р KE QUEE RC EC UU) — | 780 02-0 
(Зг,р,4,7 Ш Params.SampleRate- 1: P ре | Params.XDelta | б | Params.YDelta ) 
(2+7Х (2 —2)) > АЩр, 4, 191] +гх (тей - 191), 991] +гх (уеЁ — уд))) 
)) ) 
Squadron ¬B, e, k differs enough from its background to be discerned by squadron B, g, j at time t:


Obvious(B,g9,j,e,k,t)= 
29) - 28,4, - 1)Л уд = ув, (1 — 1)^ 
zek = T Be Wt = 1) A yek = Y-B e k(t - 1)A 


B I(a',c',zek,yek)- Army[AB ,e].SquadIntensity (k 
BlI(a!,c',rek,yek ш 


Params.SampleRate 


po ((wo (ға) и Mainloop) Y 


rams.SampleRate’ Params.SampleRate’? 
n=] 


Army[B, g|.VWEffect )+ 
| 2 
(2-в,е (Маіпіоор -1)--74) х кии) + 


Params.SampleRate 


2 
(v~B,e (Mainloop -1)-%іх pate ) 


Params.SampleRate 






SEI] » Army[-B, e'].ObsJamRadius 











nx(rek—rgj ES 
Params.SampleRate 


2 
> : p та . пх(уек--удј) й 
(у-в,« (Mainloop 1) Ук Params.SampleRate 


Army[7B,e’}].0bsJam Radius 
x Army[7B, e’]|.ObsJamEffect otherwise 
)) <Army[B, g].ObsMinContrast[j]) 


етші 


(с-в,«(Маіпіоор -1)- 29) х 





Squadron ¬B, e, k is in range of the weapons of battalion B, g at time t:


InRange(B,g,e,k,t) z 
en = fof exit — 1) Ayek = yiper(t —1)A 


(тек — xB,4(t — 1))? + (yek — yp, (t — 1))? < Army[B, g].Weapon[z].Range 


Number of Squadrons in battalion B, g dedicated to processing messages at time t: 


МитСах(В, 9,1) 


NumProcess(B,g,t) = Army[B, g].NumProcess x Army|[B, g].Squadrons 


Number of Squadrons in battalion B, g dedicated to receiving messages at time t: 


NumCas(B, 9,1) 


NumRec(B,g,t) =A B, g|.NumRecei и EE парни 
итКес(В,4,4) rmy[B, g].NumReceive x Army[B, g].Squadrons 


Number of Squadrons in battalion B, g dedicated to communications jamming at time t: 


NumCas(B, g,t) 


NumJ нок о А. B,g|.NumJ о 
итЈат(В, 9,1) rmy|[B, g]. NumJammers ШЕПТЕ 
Number of functional weapons of type i in battalion B, g at time t:


МитСах(В, 9,1) 


; 1 |. № адс 
NumWeapon(B,g,1,t) = Army[B, g].Weapon|z].NumWeapon x Атту|В, g].Squadrons 
Target coordinates for weapon i of type w in Battalion B, g at time t:


ATB g,w,i(t) = 22B ek З 


е-1 Агту 2 ; 
) 1 if 3j, Observe( Bg) e k r 
(Р S [B, e'].Squadrons и И 


ие | 
ци Y l if 3j, Observe( B, g, j,e,k',t — 1) Е 
um 0 otherwise Е 


ш-1 
| Dé N umW eapon( B, 9, «oj +:- 1 


ШО = 1 


GUB. s will) ENS 


(> Y ра eJ Squadrons [1 ЗА Cbserve(B ОЕТ J 


0 otherwise 
е'= ке 
А E l if 3j, Observe(B,g, j,e, kt - 1) |y 
= 0 otherwise B 
cg 
| p» NumW eapon(B, g, w', o) +i-1 
w=) 
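Command Message m Implemented in Battalion B, g before time t:

\[
\mathit{Mimp}(B,g,m,t) = \big( (\mathrm{Cmsgs}[B,m].\mathrm{Time} + \mathrm{Army}[B,g].\mathrm{MediaDelay}
+ \mathit{RecDelay}(B,g,\mathit{RecT}(B,g,m)) + \mathit{QueDelay}(B,g,m)
+ \mathrm{Army}[B,g].\mathrm{ProcDelay}) < t \big) \land (\mathrm{Cmsgs}[B,m].\mathrm{Dest} = g)
\]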


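Delay due to message receipt at battalion B, g at time t:

\[
\mathit{RecDelay}(B,g,t) = \begin{cases}
\infty & \text{if } \mathit{NumRec}(B,g,t) - \mathit{ComJam}(B,g,t) \le 0 \\
\dfrac{\mathrm{Army}[B,g].\mathrm{RecRate}}{\mathit{NumRec}(B,g,t) - \mathit{ComJam}(B,g,t)} & \text{otherwise}
\end{cases}
\]

Number of jammed receivers in battalion B, g at time t:

\[
\mathit{ComJam}(B,g,t) = \sum_{e=1}^{\mathrm{NArmy}[\lnot B]} \min\big(\mathit{NumJam}(\lnot B,e,t), \mathrm{Army}[\lnot B,e].\mathrm{CommJamPriority}[g]\big) \times \mathrm{Army}[\lnot B,e].\mathrm{CommJamEff} \times
\max\left(0, \dfrac{\mathrm{Army}[\lnot B,e].\mathrm{CommJamRadius} - \sqrt{(x_{\lnot B,e}(t-1) - x_{B,g}(t-1))^2 + (y_{\lnot B,e}(t-1) - y_{B,g}(t-1))^2}}{\mathrm{Army}[\lnot B,e].\mathrm{CommJamRadius}}\right)
\]
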

Delay due to message queuing of command message m in Battalion B, g: 
Duration 1 if CmdSum(B, 9, т, t) gs ReptSum(B, 9, т, t) 


ЕТТЕ » > NumProcess(B,g,t — 1) 


с= Кее Ваа и 


Time command message m is received at battalion B, g:

\[
\mathit{RecT}(B,g,m) = \mathrm{Cmsgs}[B,m].\mathrm{Time} + \mathrm{Army}[B,g].\mathrm{MediaDelay}
\]

Time delay for report message from battalion B, f to be transmitted to battalion B, g:

\[
\mathit{RepT}(B,g,f) = \mathrm{Army}[B,f].\mathrm{SendRate} + \mathrm{Army}[B,g].\mathrm{MediaDelay}
\]
Number of command messages, other than m, being processed by battalion B, g at time t:


0 if (m — n) V (Cmsgs[B, n].Dest # g)V 
(t € RecT(B,g,n)A 


NCmsgs[B] Cmsgs[B, m].Priority > Cmsgs[B, n].Priority) 
Стабит(В, 9, т,і) с % V(Cmsgs[B, n].Time > t)V 
п=1 ( RecT(B, g,n) -- Army[B, g].ProcDelay « t) 


| 1 otherwise 
Some opposing squadron exists and is observed by a squadron of B, g, at time t:

\[
\mathit{SomeObserve}(B,g,t) = (\exists e, 1 \le e \le \mathrm{NArmy}[\lnot B], \mathrm{Army}[\lnot B,e].\mathrm{Squadrons} > 0 \land \mathit{EObserve}(B,g,e,t))
\]

Some opposing squadron in battalion ¬B, e, exists and is observed by some squadron of B, g
at time t:

\[
\mathit{EObserve}(B,g,e,t) = (\exists k, 1 \le k \le \mathrm{Army}[\lnot B,e].\mathrm{Squadrons}, \mathit{Endurance}(\lnot B,e,k,t) > 0 \land
(\exists j, 1 \le j \le \mathrm{Army}[B,f].\mathrm{Squadrons}, \mathit{Endurance}(B,f,j,t) > 0 \land \mathit{Observe}(B,g,j,e,k,t)))
\]


Number of report messages being processed by battalion B, g at time t, while message m may be 
queued: 


0 if (Army[B, f]. Report Z g)V 
(Vt',t — RepT(B,g, f) - Army[B, g].ProcDelay 
< i < і- RepT (B, g, f), 


PB un "E d -SomeObserve(B, f,t'))v 
eptSum(B,g,m,t) = - (SomeObserve( B, f,t — RepT(B,g, f)^ 
ле Army[B, f].Priority < Cmsgs[B, m].Priority) 


1 otherwise 
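Battalion B, g is active:

\[
\mathit{Active}(B,g) = ((\mathrm{Duration} > 0) \land (\mathrm{Mainloop} \in \{0 \ldots \mathrm{Duration}\}) \land
(B \in \{\mathrm{TRUE}, \mathrm{FALSE}\}) \land (\mathrm{NArmy}[B] > 0) \land
(g \in \{1 \ldots \mathrm{NArmy}[B]\}) \land (\mathrm{Army}[B,g].\mathrm{Squadrons} > 0) \land
(\exists i, 1 \le i \le \mathrm{Army}[B,g].\mathrm{Squadrons}, \mathit{Endurance}(B,g,i,\mathrm{Mainloop}) > 0))
\]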
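
To show how a predicate such as Active can be read operationally, the following sketch evaluates it as a Boolean function. The data layout is an assumption made for illustration only: `n_army` maps a side to its number of battalions, `squadrons` maps a (side, battalion) pair to its squadron count, and `endurance` is a caller-supplied stand-in for the Endurance predicate, not the definition given in this appendix.

```python
def is_active(side, g, mainloop, duration, n_army, squadrons, endurance):
    """Evaluate the Active(B, g) predicate for one battalion.

    side      -- True or False, identifying army B
    g         -- 1-based battalion index
    mainloop  -- current simulation cycle
    duration  -- total number of cycles
    n_army    -- dict: side -> number of battalions (hypothetical layout)
    squadrons -- dict: (side, g) -> number of squadrons (hypothetical layout)
    endurance -- callable(side, g, i, t) -> float (hypothetical stand-in)
    """
    if not (duration > 0 and 0 <= mainloop <= duration):
        return False
    if not isinstance(side, bool) or n_army.get(side, 0) <= 0:
        return False
    if not 1 <= g <= n_army[side]:
        return False
    count = squadrons.get((side, g), 0)
    if count <= 0:
        return False
    # At least one squadron must still have positive endurance.
    return any(endurance(side, g, i, mainloop) > 0 for i in range(1, count + 1))


# Illustrative data: squadron 1 loses 2 units of endurance per cycle.
armies = {True: 2, False: 1}
squads = {(True, 1): 3, (True, 2): 0, (False, 1): 2}


def toy_endurance(side, g, i, t):
    return 5.0 - 2.0 * t if i == 1 else 0.0


print(is_active(True, 1, 1, 10, armies, squads, toy_endurance))
print(is_active(True, 2, 1, 10, armies, squads, toy_endurance))
```

Each conjunct of the predicate becomes an early return, so the first violated bounding condition determines why a battalion is not active.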


Altitude at position (z, y) in Terrain grid (p, q): 


Terrain[p,q] ^ —'Terrain[p 4 1, q]- 
Alt(p,q,z,y) —| Terrain[p, q + 1]+Terrain[p + 1,q+ 1] хсху| + 
arams.A Deita x Params. Y Delta 


(4 -- 1)(Теггаїп(р, 4) - Теггаіп[р + 1, 4]) 


4(Теггащр,4 + Ц -Теггашщр + 1,4 + 1])– 
хх + 
Params. xX Delta 


(p + 1)(Terrain[p, q]—Terrain[p, q + 1)) 


p(Terrain[p + 1,q] —Terrain[p + 1,4 + 1])— 
ху] + 
arams. elta 


(p+ 1)((q + 1)Terrain[p, g] — qTerrain[p, q + 1])- 
p((q + 1)Terrain[p + 1, q] — qTerrain[p + 1, q+ 1]) 
Background Intensity at position (x, y) in Terrain grid (p, q):





Теггаіп[р, 9 + 1] — Terrain[p, q] - 
Теггаіп[р + 1,9 + 1] – Terrain[p + 1,9] - 
2(Params.X Delta) 

Terrain[p + 1,q + 1] — Terrain[p, g + Ц+ 
Теггаіп[р + 1, q] — Terrain{p, q] 

2(Params.Y Delta) 


x Params.ISlopeFactor4- 
Params.lIAltFactor eee оли 


Params IMeanAlt 
Рагатѕ.іХ x x + Params.IY x y + Params.IC 


Dp ger = 


Failure Region Definitions


1.1: Incorrect handling of NumCas when Army.Squadrons=0 initially
Condition I:

\[
\mathrm{Duration} > 0 \land (\exists B, B \in \{\mathrm{true}, \mathrm{false}\}, \mathrm{NArmy}[B] > 0
\]

Condition II:

\[
(\exists g, 1 \le g \le \mathrm{NArmy}[B], \mathrm{Army}[B,g].\mathrm{Squadrons} = 0))
\]

Condition III: True

1.23: Update always implements commands ready at the same time in 
CMsgs array order 


Condition I: 


Active(B, g)^ 

(3m,n,1 € m € NCmsgs[B], 1 € n € NCmsgs[B], m < n^ 
M imp( B, g, m, Mainloop) ^ ^M imp(B, g, m, Mainloop — 1)A 
M imp( B, g, n, Mainloop) ^ ^M mp(B, g,n, Mainloop — 1) 


Condition II: 
Cmsgs[B, m].Priority « Cmsgs[B, n].Priority ^ Cmsgs[B, m].msg 2 Cmsgs[B, m].msg 
Condition III: 


ша ез < NOmsgs[B]; i A mAi X nA 
Mimp(B, g,i, Duration) A M imp(B, g,i, Mainloop — 1))) 


1.3: Over-restrictive check: positive dWX 
Condition I: $\text{Params.NumWEvents} > 0$
Condition II:

\[ \exists i,\ 1 \le i \le \text{Params.NumWEvents},\ \text{Weather}[i].\text{dWX} < 0 \]
Condition III: True




1.4: Over-restrictive check: positive dWY 
Condition I: $\text{Params.NumWEvents} > 0$
Condition II:

\[ \exists i,\ 1 \le i \le \text{Params.NumWEvents},\ \text{Weather}[i].\text{dWY} < 0 \]
Condition III: True


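1.5: Garbage value in FixSuppl when Fix Supplies exhausted
Condition I:

\[ \text{Active}(B, g) \land (\exists j,\ 1 \le j \le \text{Army}[B, g].\text{Squadrons},\ \text{Casualty}(B, g, j, \text{Mainloop})) \]
Condition II:

\[ \left( \sum_{i=1}^{\text{Army}[B,g].\text{Squadrons}} \text{Repair}(B, g, i, \text{Mainloop}) \right) > \text{Army}[B, g].\text{FixSuppl} \]
Condition III:

\[ (\lnot\exists i,\ 1 \le i \le \text{NCmsgs}[B],\ \text{Mimp}(B, g, i, \text{Duration}) \land \lnot\text{Mimp}(B, g, i, \text{Mainloop} - 1)) \]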


1.6: Spurious input check requiring IAF > 0 
Condition I:True 

Condition II: $\text{Params.IAltFactor} < 0$

Condition III:True 


1.7: Spurious Input check requiring NumWEvents > 0 
Condition I:True 

Condition II: $\text{Params.NumWEvents} < 0$

Condition III:True 
1.8: Negative NW value
Condition I: 


3B,g, e, t, Active(B, g) ^ Active(B, e) ^ 1 < t « MainloopA 
(3j, 5, 1 € k € Army[-B, e].Squadrons A 1 € j € Army[B, g].SquadronsA 
Endurance(B,g, j,t) » 0^ Endurance(-B,e, k,t) » 0^ Observe(B,g, j, e, k, t))^ 
Params.NumW'Types » 1 


Condition II: 


3i, 1 € : € Params.NumW Types, Army[B, g]. WeapPriority[e, 7] « ОМ 
NumW eapon(B, g,2, Mainloop) 
« (МитУИеароп(В, 9, 1, Маїпіоор - 1) - NumWeapon(B, g, i, Mainloop)+ 
Mainloop NArmy[~B] /min(| {k’ 3 4j, Observe(B, g, J, e’, k’, n — 1)} |, 
5 Army([B, g].WeapPriority[e’, i], 
п=1 е! 1 NumW eapon( B, 9, ї, п)) 
NumW eapon(B,g,i,Mainloop—1 






Condition III: 


(Ат, 1 € m € NCmsgs[B], Mimp(B, g, m, Duration) A 2M imp(B, g, m, Mainloop — 1))A 
(Ат, 1 € m € NCmsgs[-B], Mimp(-B, e, m, Duration)A 
М ітр( ЗВ, e, m, Mainloop - 1)) 


1.9: PSentListLoc sends out of range squadron to SquadAlive 
Condition I: 


Active(B, g) ^ Active(B, e) ^ Active(B, f) ^ Army[B, f].Report = g^ 
(3t, 1 € t € Duration, 
і = Mainloop – RepT(B, f, g) — Army[B, g]. ProcDelay 


Army[B ,g].RecRate 
~ NumReec(B,g,Mainloop—Army([B,g].ProcDelay А 
(3k,1 € k € Army[-B, e].Squadrons, (3j, 1 < j < Army[B, f].Squadrons, 
Observe(B, f, j, e, k, t) ^ Endurance(-B,e, k,t) » 0)) 


Condition II: 


(3m,1 € m € NCmsgs[- B], (M imp(-B,e,m,t)) ^ Mimp(-B,e, m, Mainloop)A 
Army[-B, e].Squadrons » Cmsgs[-B , m].msg.Squadrons)) 


Condition III:True 
1.10: Restriction that SquadIntensity>0
Condition I:

\[
\begin{aligned}
& (\exists B,\ B \in \{\text{true}, \text{false}\},\ \text{NArmy}[B] > 0 \land {} \\
& (\exists g,\ g \in \{1 \ldots \text{NArmy}[B]\},\ \text{Army}[B, g].\text{Squadrons} > 0 \\
& \land (\exists j,\ j \in \{1 \ldots \text{Army}[B, g].\text{Squadrons}\},
\end{aligned}
\]
Condition II:

\[ \text{Army}[B, g].\text{SquadIntensity}[j] \le 0))) \]
Condition III: True


1.11: Restriction that FixSuppl > 0 
Condition I: 


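\[
\begin{aligned}
& (\exists B,\ B \in \{\text{true}, \text{false}\},\ \text{NArmy}[B] > 0 \land {} \\
& (\exists g,\ g \in \{1 \ldots \text{NArmy}[B]\},\ \text{Army}[B, g].\text{Squadrons} > 0
\end{aligned}
\]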
Condition II: 
Army[B, g].FixSuppl)) 


Condition III:True 


1.12: Segmentation fault when squadron leaves Terrain grid 
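Condition I: $\text{Active}(B, g)$
Condition II:

\[
\begin{aligned}
& (\exists j,\ 1 \le j \le \text{Army}[B, g].\text{Squadrons},\ \text{Endurance}(B, g, j, \text{Mainloop}) > 0 \land {} \\
& (X_{B,g,j}(\text{Mainloop}) < 0 \lor X_{B,g,j}(\text{Mainloop}) > \text{Params.XDelta} \times \text{MaxTerrain} \lor {} \\
& Y_{B,g,j}(\text{Mainloop}) < 0 \lor Y_{B,g,j}(\text{Mainloop}) > \text{Params.YDelta} \times \text{MaxTerrain}))
\end{aligned}
\]

Condition III: True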


1.13: Weapon use functions misordered 
Condition I: 


3B, g,e, Active(B,g) ^ Асії»е( -В, е)л 
(3k,1 € k € Army[-B, e].Squadrons, Endurance(-B, e, k, Mainloop) » 0A 
(33,1 € j € Army[B, g].Squadrons, Observe( B, g, j, e, k, Mainloop — 1))) 


Condition II: True 
Condition III: 


(&' | (3j, Observe(B,g, j, e, k', Mainloop))] 

x (&k" | (3), Observe(B, 9, j, e, k", Mainloop - 1))}л 

(33,1 € j € Army[B, g].Squadrons, Endurance( B, g, j, Duration) » 0)A 
(3k,1 € k € Army[-B,e].Squadrons, Endurance(- B, e, k, Duration) > 0) 
1.14: Observation list reversed, causes error in firing

and

1.15: Unnecessary addition of one to target list subscript in arguments
to Set LLCoords

and

1.16: Unnecessary adding of one to weapon subscript in arguments to
Set LLCoords

and

1.26: Improper targeting due to misordered observation list
Condition I: 


Active(B,g) A Active(=B, e)A 

(3k,1 € k € Army[-B, e].Squadrons, Endurance(—B, e, k, Duration) » OA 
(37,1 4 ) € Army[B, g].Squadrons, Endurance(B, ӯ, j, Mainloop) » 0^ 
Observe(B, 9g, 3, e, k, Mainloop — 1))) ^ Params.NumWTypes » 1 


Condition II: 

(3k',1 € &' € Army[-B, e].Squadrons, Endurance(-B, e, k', Mainloop — 1) » 0^ 

(33, Observe(B, g, j, e, k', Mainloop — 1))A 

(2-век(Машјоор) БЕ г-векСМашјоор) У у-в е к(Машјоор) # у-в.е к(Маш]оор))) 
Condition III: 

| {k 3 (3), Observe(B, g, j, e, k, Mainloop))] |» 

min(Army[B, g].WeapPriority[e, 1], NumWeapon(B, g, 1, Mainloop))A 

(Army[B, g]. Weapon[1].Damage z Army[B, g].Weapon(2]. DamageV 

Army[-B, e]. WeapSensativity[1] Z Army[-B, e]. WeapSensativity[2])^ 

(Ат, 1 < т < NOmsgs[-H], 

Mimp(-B,e, m, Duration) A 2M imp(-B, e, m, Mainloop)) 
1.17: Accepts Army.Squadrons=0 as valid data

Condition I:

\[ (\exists B,\ B \in \{\text{true}, \text{false}\},\ \text{NArmy}[B] > 0 \]
Condition II:

\[ \text{Army}[B, g].\text{Squadrons} = 0) \]
Condition III: True


1.18: TerrMoveTM returns unstable value if battalion leaves terrain grid 
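Condition I: $\text{Active}(B, g)$
Condition II:

\[
\begin{aligned}
& (X_{B,g}(\text{Mainloop}) < 0 \lor X_{B,g}(\text{Mainloop}) > \text{Params.XDelta} \times \text{MaxTerrain} \lor {} \\
& Y_{B,g}(\text{Mainloop}) < 0 \lor Y_{B,g}(\text{Mainloop}) > \text{Params.YDelta} \times \text{MaxTerrain})
\end{aligned}
\]

Condition III:

\[
\begin{aligned}
& \text{Duration} > \text{Mainloop} \land {} \\
& (\lnot\exists i,\ 1 \le i \le \text{NCmsgs}[B],\ \text{Mimp}(B, g, i, \text{Duration}) \land \lnot\text{Mimp}(B, g, i, \text{Mainloop} - 1))
\end{aligned}
\]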


1.19: NumCas not cleared by command message 
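Condition I:

\[
\begin{aligned}
& \text{Active}(B, g) \land {} \\
& (\exists i,\ 1 \le i \le \text{NCmsgs}[B],\ \text{Mimp}(B, g, i, \text{Mainloop}) \land \lnot\text{Mimp}(B, g, i, \text{Mainloop} - 1))
\end{aligned}
\]

Condition II:

\[ \exists j,\ 1 \le j \le \text{Army}[B, g].\text{Squadrons},\ \text{Casualty}(B, g, j, \text{Mainloop}) \]
Condition III: True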
1.20: NW>0 when KF<0


Condition I: 


Active(B,g) ANArmy[7B] > 0 A Params.NumW Types > 0л 
(342, 1<2< Params.NumWTypes, 
( Army[B, g]. Weapon[i]. NumWeapon » 0)A 
( Army[B, g]. Weapon[i].UseLimit > ОЈЛ 
( Army[B, g]. Weapon[;].Range » 0)A 
(Зе 1 €e&€NArmy[-B], Army[-B, e].Squadrons » 0A 
(3k, 1€ k € Army[^B, e].Squadrons, 
(3, 1€ j € Army[B, g].Squadrons, 
Endurance(-B e, К, Mainloop) > 0A 
Endurance(B,g, j, Mainloop) » 0A 
Observe(B,g, 3, e, kK, Mainloop — 1) A InRange(B,g, i, e, k, Mainloop) 


Condition II:(Army[B, g]. Weapon[i].FireRate < 0) 
Condition III: 


(Army[B, g]. Weapon[?7].Damage Z 0) ^ (Army[^B, e]. WeaponSensativity[?] > 0)A 
(Duration » Mainloop)A 

(Ar,1«€r € NCmsgs[^B], 

Mimp(-B,e,r, Duration) A 2Mimp(-B,e,r, Mainloop — 1)))) 
1.22: Report Message processed ahead of command message with equal
priority, receipt time
Condition I: 


Active(B, g) ^ Active(^B, e) ^ Active(B, f)^ 
(3i, 1 € i € NCmsgs[B], Мітр(В, 9, 1, Маїпіоор) л - Мітр(В, 9, 1, Mainloop — 1)A 
Armyl[B, f].Report — g ^ (3t, 1 € t € Duration, 
(t 2 Mainloop — Army[B, g].ProcDelay — Army[B, g].MediaDelay 
—Army[B, f].SendRate — 1)A 
(3k,1 € k € Army[-B,e].Squadrons, Endurance(—B, e, k, t) » Ол 
3j, 1 € j € Army|[B, f].Squadrons, Observe(B, f, j, e, k,t))) 


Condition II: Army[B, f].Priority = Cmsgs[B, i].Priority 
Condition III: 
(EX | if Mimp(B,g,m, Mainloop) ^ 2Mimp(B,g9, m, Mainloop — 4 


0 otherwise 
» NumProcess(B, g, Mainloop)) 


m=) 


1.23: Invalid width, height when squadron leaves grid 
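Condition I: $\text{Active}(B, g)$
Condition II:

\[
\begin{aligned}
& (\exists j,\ 1 \le j \le \text{Army}[B, g].\text{Squadrons},\ \text{Endurance}(B, g, j, \text{Mainloop}) > 0 \land {} \\
& (X_{B,g,j}(\text{Mainloop}) < 0 \lor X_{B,g,j}(\text{Mainloop}) > \text{Params.XDelta} \times \text{MaxTerrain} \lor {} \\
& Y_{B,g,j}(\text{Mainloop}) < 0 \lor Y_{B,g,j}(\text{Mainloop}) > \text{Params.YDelta} \times \text{MaxTerrain}))
\end{aligned}
\]

Condition III:

\[ (\lnot\exists i,\ 1 \le i \le \text{NCmsgs}[B],\ \text{Mimp}(B, g, i, \text{Duration}) \land \lnot\text{Mimp}(B, g, i, \text{Mainloop} - 1)) \]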




1.25: Observations and Weapon coordinates cleared by command messages
Condition I: 


Active( B, g)^ 
(34,1 € i € NCmsgs[B], Mimp(B, g, i, Mainloop) ^ 2M imp( B, 9,1, Mainloop — 1)) 


Condition II: 


Аспие(- В, е)А 

(Ji, 1<i< Params.NumWTypes, 

( NumW eapon( B, g, 1, Mainloop) » 0^ 
(Army[B, g]. Weaponl[i].FireRate » 0)A 
(Army[B, g]. Weapon[i]. UseLimit > 0)A 
(Army [B, g]. Weapon[i].Range » 0)A 

(ЗЕ, 1 € k € Army[-B, e].Squadrons, 

(3j, 1 € j € Army[B, g].Squadrons, 
Endurance(-B, e, k, Mainloop) » 0A 
Endurance(B, 9, j, Mainloop) » 0A 
Observe( B, g, j, e, k, Mainloop — 1) A InRange(B,9,i, e, k, Mainloop - 1) 


Condition III: 


(Army[B, g]. Weapon[;].Damage Z 0) ^ (Army[^B, e]. WeaponSensativity[;] » 0)A 


(Duration » Mainloop 4- 1) A (^Casualty(—B, e, k, Mainloop))A 
0.5 > Endurance(7~B,e,k,Mainloop)—Damage(7~B,e,k, Mainloop)+Damage(7~B,e,k ,Mainloop—1 A 





Army|^B,e|.Endurance|k 


(-3г, 1 < r € NCmsgs[- B], 
Mimyp(-B,e,r, Duration) A 2 Mimp(-B,e,r, Mainloop — 1)))) 
1.27: Enemy instead of current position in observation Jamming
Condition I: 


Active(—^B,e) ^ Active( B, g)^ 
(3k,1 € k < Army[-B, e].Squadrons, 
(37,1 4 ) € Army[B, g].Squadrons, 
Endurance(-B, e, k, Mainloop) » 0^ 
Endurance(B, 9, j, Mainloop) » 0 ^ BigEnough(B,g, j, e, kK, t) ^ Clear(B,g, j, e, k,t) 


Condition II: 


Params.SampleRate > 2A 
(2-8 «(Машіоор) Ж гв,(Маіпіоор)у 
y^B,e(Mainloop) ж ув,(Маіпіоор))л 
29) — zg,,,(t — 1) ^ ygj — un, g,; (t — ЦА 
тек = то Веј: — L)^yek — овен са 
(| Ue ee Ea _. 


Bl(a',c',zek,yek) 
Params.SampleRate 


((WO(zek, yek, Mainloop) x Army[B, g]. VWEffect) + 


n-i 


(z-B,e(Mainloop = 1) es гек)? + 


0 if (ys EM ineo DERE » Army[-B, e'].ObsJamRadius 


МАгту[~ В] 





(т- ве (Маиоор - 1) = rek)? + 
(у-в.е (Mainloop — 1) — yek)? 


Army[7B,e’].ObsJamRadius 
)) <Army[B, 4). СЫМ ЛС) 


Condition III:True 


емші 





x Army[-B, e'].ObsJamEffect otherwise 


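1.28: Allocated fixing exceeds NumFixers × FixRate
Condition I:

\[ \text{Active}(B, g) \land (\exists j,\ 1 \le j \le \text{Army}[B, g].\text{Squadrons},\ \text{Casualty}(B, g, j, \text{Mainloop}) \]
Condition II:

\[
\begin{aligned}
& (\text{Endurance}(B, g, j, \text{Mainloop}) - \text{Army}[B, g].\text{Endurance}[j]) > {} \\
& (\text{Army}[B, g].\text{FixRate} \times \text{NumFix}(B, g, \text{Mainloop})
\end{aligned}
\]

Condition III:

\[
\begin{aligned}
& (\text{Endurance}(B, g, j, \text{Mainloop}) + \text{Army}[B, g].\text{FixRate} \times \text{NumFix}(B, g, \text{Mainloop})) \\
& < \text{Army}[B, g].\text{Endurance} \land {} \\
& (\lnot\exists m,\ 1 \le m \le \text{NCmsgs}[B],\ \text{Mimp}(B, g, m, \text{Duration}) \land \lnot\text{Mimp}(B, g, m, \text{Mainloop}))
\end{aligned}
\]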
APPENDIX C
FAILURE REGION-VARIABLE INCIDENCE MATRICES


The tables in this appendix contain the results of the analysis of the failure
regions of the faults from Shimeall and Leveson's study (Shimeall and Leveson,
1991). The failure regions are contained in a technical report (Shimeall, 1991).

The rows of the table are labeled with program dimensions. The columns are
labeled with failure region numbers. A plus (+) after a failure region number
indicates that multiple faults had exactly the same failure region; the data for
the failure region analysis were entered only once in the table.

The table entries are of the form: I 5. The "I" is an artifact of the initial
analysis method and is no longer of importance. The number, e.g. 5, gives the
lowest numbered failure region to which that failure region is identical in its
bounds for the given program dimension.
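
The entry format described above (a status letter followed by a region number) can be read with a few lines of C. The following is a minimal sketch; the struct entry type and the parse_entry function are hypothetical names introduced here for illustration and are not taken from the analysis program.

```c
#include <stdio.h>

/* One table entry: a status letter and the lowest-numbered failure region
   with identical bounds for this program dimension. */
struct entry {
    char status;   /* e.g. 'I' */
    int  region;   /* e.g. 5 */
};

/* Parse an entry such as "I 5". Returns 1 on success, 0 if the text
   does not have the letter-then-number form. */
int parse_entry(const char *text, struct entry *out)
{
    char status;
    int region;

    if (sscanf(text, " %c %d", &status, &region) != 2)
        return 0;
    out->status = status;
    out->region = region;
    return 1;
}
```

Two failure regions whose entries in the same row carry the same region number share identical bounds for that program dimension.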
[Tables: failure region-variable incidence matrices. The table bodies are not legible in this copy. Recoverable captions: "Version 2 Failure Region-Variable Incidence Matrix, Part 1 (continued)" and "Version 3 Failure Region-Variable Incidence Matrix, Part 1 (continued)". Legible row labels include Params.XDelta, Params.SampleRate, Params.NumWTypes, Army[].Squadrons, Army[].SquadWidth, Army[].SquadLength, Army[].SendRate, Army[].ProcDelay, and Army[].MediaDelay.]
APPENDIX D


FAILURE REGION-VARIABLE INCIDENCE MATRIX ANALYSIS
CODE


#include <stdio.h>
#include <stdlib.h>   /* exit */
#include <string.h>   /* strlen */
#define NUMVARS 70
#define NUMREG 50


main ()
{
    /* arrays are sized +1 because indices run from 1 through
       NUMVARS or NUMREG inclusive */
    char line[2048];
    int regid[NUMREG+1];
    int similar[NUMVARS+1][NUMREG+1];
    int identical[NUMVARS+1][NUMREG+1];
    int graph[NUMREG+1][NUMREG+1][NUMVARS+1];
    int I11[NUMREG+1][NUMREG+1];
    int S11[NUMREG+1][NUMREG+1];
    int C11[NUMREG+1][NUMREG+1];
    int N00[NUMREG+1][NUMREG+1];
    float Ijaccard[NUMREG+1][NUMREG+1];
    float Sjaccard[NUMREG+1][NUMREG+1];
    float Cjaccard[NUMREG+1][NUMREG+1];
    int I00[NUMVARS+1];
    int S00[NUMVARS+1];
    int C00[NUMVARS+1];
    int IjaccardCluster[NUMREG+1];
    int SjaccardCluster[NUMREG+1];
    int CjaccardCluster[NUMREG+1];
    float IjaccardValue[NUMREG+1];
    float SjaccardValue[NUMREG+1];
    float CjaccardValue[NUMREG+1];
    int temp1, temp2, temp3;
    float temp1f, temp2f, temp3f;
    int ln, col, i, j, k, f, maxline, maxreg;
    char status;


    /******** initialize arrays ********/
    for (i=0; i<=NUMVARS; i++)
    {
        I00[i] = 0;
        S00[i] = 0;
        C00[i] = 0;
        for (j=0; j<=NUMREG; j++)
        {
            identical[i][j]=0;
            similar[i][j]=0;
        }
    }
    for (i=0; i<=NUMREG; i++)
    {
        regid[i]=0;
        IjaccardCluster[i] = 2*NUMREG;
        SjaccardCluster[i] = 2*NUMREG;
        CjaccardCluster[i] = 2*NUMREG;
        for (j=0; j<=NUMREG; j++)
        {
            /* clear the pairwise counters before they are incremented */
            I11[i][j] = 0;
            S11[i][j] = 0;
            C11[i][j] = 0;
            N00[i][j] = 0;
        }
    }
    ln=0;
    maxreg = 0;
    if (fgets(line, 2048, stdin)==NULL) exit(1);


    /* parse the region numbers corresponding to the columns */

    col=1;                      /* skip leading tab */
    i=0;
    while (col<strlen(line))
    {
        i++; f=0;
        while (line[col]>='0' && line[col]<='9')
        {
            f = f * 10 + line[col] - '0'; col++;
        }
        regid[i]=f;
        if (i > maxreg) maxreg=i;
        /* skip rest of field */
        while (line[col]!='\t' && col<strlen(line))
            col++;
        col++;
    }

шипом раг5е спе body of the table */ 
while(!feof(stdin)) 
{ 


ЕЕ; 1-1; /* increment var, reset region */ 
if (fgets(line, 2048, stdin)==NULL) break; 
line [strlen(line)-1]='\0'; л К Ыт Хи Его line */ 
со1=0; 
while (col «€ strlen(line)&& line[col]!='\t') 
БЕРЕ /* skip line label */ 
со1++; Етвеш start Of first field */ 
while (col < strlen(line) ) 


{ 


while (line[col]=='\t') 


{ 


/* skip over empty fields */ 
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СО MEN 
} 
if Јаде (сол Ре ции /* if at end of line, get out of Socom 
{ 
Status-line[col]; 
col++; /% атар. 175 Пи 
if «(bine cold =e) 
{ 
Е=0; /* parse for region number after I or § */ 
while (line[col]»-'0' && line[col]<='9') 
{ 
Е = Е * 10 + Irne[cobtle m O ты 
} 
/* store entry in appropriate панк ши 
if (status == 'І!) 
{ 
if(identical[ln][f]l < Е && identical [lnm ји након 
1депЕ1са1 |1п| | хедт9 1) | = identica mmmn 
else identical[ln] [regid[i]]=f; 
if бет таг ЕЛ bel o 
біті 1ағг (Іп) Ігесасіт|!-в таг ЕЕ” 
} 
else if (status == !5") 
{ 
if (identical[iIn] [f]<f ка таепетсат [рей 5 
similar(in] [regid[i) }=identical (ima не 
else similar([ln] [regid[i]]=f; 
} 
else if (status == 'U') /* treat as isolated identical */ 
( 1dentical[ln] |regral|] = теат 
} 
р else if (status == 'U') /* treat as isolated identical */ 
identical[lin] [regid(i]] = теста ш е 
while (line[col]!='\t' && col<strlen(line) ) 
со1++; /% вКір гевс оғ епсіу и 
} 
} 
maxline = Іп; 
} 


    /* determine if occurrences are identical, similar, or coincidental */

    for (i=2; i<= maxreg; i++ )         /* compare regions pairwise */
    for (j=1; j<=(i-1); j++)
    {
        for (k=1; k<= maxline; k++)     /* for each pair of regions,
                                           consider each variable */
        {
            /* if the variable occurrences are identical, graph = 3 */
            if ( ( identical[k][regid[j]] != 0)
                && ( identical[k][regid[j]] == identical[k][regid[i]] ) )
                graph[regid[i]][regid[j]][k] = 3;
            else /* if the variable occurrences are similar, graph = 2 */
            if ( ( similar[k][regid[j]] != 0)
                && ( similar[k][regid[j]] == similar[k][regid[i]] ) )
                graph[regid[i]][regid[j]][k] = 2;
            else /* if the variable occurrences are coincidental,
                    graph = 1 */
            if ( ( identical[k][regid[j]] != 0
                   || similar[k][regid[j]] != 0 )
                && ( identical[k][regid[i]] != 0
                   || similar[k][regid[i]] != 0 ) )
                graph[regid[i]][regid[j]][k] = 1;
            else /* the variables are not coincident in this
                    pair of regions, graph = 0 */
                graph[regid[i]][regid[j]][k] = 0;
        }
    }


/* compute proximity indices and coefficients  */ 


for (i=2; i<= maxreg; it+ ) /* compare regions pairwise */ 
EX =]; <= ЕЕ) 
{ 
КОТ (к= k<— maxbine; kt) /* for each pair of regions, 
consider each variable */ 


{ 


аспас тес ея] [К] == 3) 111 [веда [1] ] [тед1а[51]++; / 
* variables іп regions i,j that are identical */ 
Pera раза орні а Је а те] == 2) Cll[regid[i]][regid[j]l]+t+; / 
* variables in regions i,j that are conincedental*/ 
Dem identical[k][([regid[jl] == 0 
БЕ similar[k] [regid[j]] == 0 /* variables that appear in 
&& identical[k] [regid[i]] == 0 ИДЕ лет пестоп i nor j */ 
66 similar[k] [regid[1] ] == 0 ) 


NOO[regid[i]][regid[jl]*-*; 

| 
асса га [гесіа[і] ] [гедіа[5]] 
ИО ( тах11пе 


р sedardig) 7 
100 ве са [1] | | тесда а о); 


ExSccard[regid[il][regid[jl] она аа е | кеолеа и жеста 1 1)) 
р В па ле = моб геста | 111 | теста 111) ); 
} 


    /******** order Jaccard coefficients ********/

    temp1f = 1.0;
    temp2f = 0.9999;
    /*******Identical coefficients*******/
    while (temp1f >= 0.0)
    {
        for (i=2; i<= 21; i++ )         /* compare regions pairwise */
        for (j=1; j<=(i-1); j++)
        if ( (Ijaccard[regid[i]][regid[j]] <= temp1f) &&
             (Ijaccard[regid[i]][regid[j]] > temp2f) )
        {
            if (I11[regid[i]][regid[j]] == 0)
                I00[(maxline-N00[regid[i]][regid[j]])]++;
            else
                printf("Ijaccard[%d][%d] = %f , Fraction = %d / %d \n",
                    regid[i], regid[j], Ijaccard[regid[i]][regid[j]],
                    I11[regid[i]][regid[j]], (maxline-N00[regid[i]][regid[j]]));
            f=NUMREG + 1;               /* determine the order in which the */
            for (k=1; k<=NUMREG; k++)   /* regions appear */
                if (IjaccardCluster[k] == regid[i]) f=k;
            if (f > NUMREG)
                for (k=1; k<=NUMREG; k++)
                {
                    if (IjaccardCluster[k] > NUMREG)
                    {
                        IjaccardCluster[k] = regid[i];
                        IjaccardValue[k] = Ijaccard[regid[i]][regid[j]];
                    }
                    if (IjaccardCluster[k] == regid[i]) break;
                }
            f=NUMREG + 1;
            for (k=1; k<=NUMREG; k++)
                if (IjaccardCluster[k] == regid[j]) f=k;
            if (f > NUMREG)
                for (k=1; k<=NUMREG; k++)
                {
                    if (IjaccardCluster[k] > NUMREG)
                    {
                        IjaccardCluster[k] = regid[j];
                        IjaccardValue[k] = Ijaccard[regid[i]][regid[j]];
                    }
                    if (IjaccardCluster[k] == regid[j]) break;
                }
        }
        temp1f = temp2f;
        temp2f -= 0.0001;
    }
    for (i=1; i<=maxline; i++) printf("I00[%d] = %d \n", i, I00[i]);

    for (i=1; i<=21; i++)
        printf("IjaccardCluster[%d] = %d , IjaccardValue[%d] = %f \n",
            i, IjaccardCluster[i], i, IjaccardValue[i]);

    temp1 = 21;
    temp1f = 1.0;
    temp2f = 0.9999;

    /*******Coincidental coefficients*******/

    while (temp1f >= 0.0)
    {
        for (i=2; i<=temp1; i++ )       /* compare regions pairwise */
        for (j=1; j<=(i-1); j++)
        if ( (Cjaccard[regid[i]][regid[j]] <= temp1f) &&
             (Cjaccard[regid[i]][regid[j]] > temp2f) )
        {
            printf("Cjaccard[%d][%d] = %f , fraction = %d / %d \n",
                regid[i], regid[j], Cjaccard[regid[i]][regid[j]],
                C11[regid[i]][regid[j]], (maxline-N00[regid[i]][regid[j]]));
            f=NUMREG + 1;               /* determine the order in which the */
            for (k=1; k<=NUMREG; k++)   /* regions appear */
                if (CjaccardCluster[k] == regid[i]) f=k;
            if (f > NUMREG)
                for (k=1; k<=NUMREG; k++)
                {
                    if (CjaccardCluster[k] > NUMREG)
                    {
                        CjaccardCluster[k] = regid[i];
                        CjaccardValue[k] = Cjaccard[regid[i]][regid[j]];
                    }
                    if (CjaccardCluster[k] == regid[i]) break;
                }
            f=NUMREG + 1;
            for (k=1; k<=NUMREG; k++)
                if (CjaccardCluster[k] == regid[j]) f=k;
            if (f > NUMREG)
                for (k=1; k<=NUMREG; k++)
                {
                    if (CjaccardCluster[k] > NUMREG)
                    {
                        CjaccardCluster[k] = regid[j];
                        CjaccardValue[k] = Cjaccard[regid[i]][regid[j]];
                    }
                    if (CjaccardCluster[k] == regid[j]) break;
                }
        }
        temp1f = temp2f;
        temp2f -= 0.0001;
    }
    for (i=1; i<=temp1; i++)
        printf("CjaccardCluster[%d] = %d , CjaccardValue[%d] = %f \n",
            i, CjaccardCluster[i], i, CjaccardValue[i]);

    exit (0);
}
APPENDIX E
THRESHOLD GRAPH EDGE LISTINGS


This appendix lists the weights of the edges in the graphs for the four
experimental versions. Only nonzero edges are listed. The weights are presented
in two forms: a decimal fraction and a ratio of two integers. The ratio
represents the exact weight; the decimal fractions were used for ordering the
magnitudes of the edges.

I[][]is the coefficient for the Identical dimension. 

C[][]is the coefficient for the Coincidental dimension. 

J[][]is the coefficient for the Composite dimension. 

Also presented is the order in which the nodes were connected in the graph 
and the threshold values at which they were connected, i.e., the value of their larg- 
est weighted incident edge. 

Node[ ] gives the number of the failure region being connected in the graph. 

I[], Cf], andJ[]give the threshold value at which the first edge is added to 


that node. 
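
The relationship between the edge weights and the Node[] listings can be illustrated with a minimal C sketch. It is not the program used in this study: the names Edge, MAX_NODES, by_weight_desc, and connection_order are hypothetical, and the sketch orders edges by comparing the exact ratios rather than by stepping a decimal threshold downward. It visits edges from largest to smallest weight and records each region the first time an edge touches it, together with the weight of that edge.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_NODES 64            /* assumed upper bound on region numbers */

typedef struct {
    int m, n;                   /* failure-region numbers of the two endpoints */
    int num, den;               /* exact weight as the ratio num / den */
} Edge;

/* Compare two ratios exactly by cross-multiplication; larger weights sort first. */
static int by_weight_desc(const void *a, const void *b)
{
    const Edge *x = a, *y = b;
    long lhs = (long)x->num * y->den;
    long rhs = (long)y->num * x->den;
    return (lhs < rhs) - (lhs > rhs);
}

/* Fills node[] and threshold[] in connection order: node[k] is the k-th
   region to receive an edge, threshold[k] is the weight of that first edge.
   Sorts edges[] in place and returns the number of regions connected. */
int connection_order(Edge *edges, int n_edges, int node[], double threshold[])
{
    int seen[MAX_NODES + 1] = {0};
    int count = 0;

    qsort(edges, (size_t)n_edges, sizeof edges[0], by_weight_desc);
    for (int e = 0; e < n_edges; e++) {
        int ends[2] = { edges[e].m, edges[e].n };
        assert(edges[e].den > 0);
        for (int t = 0; t < 2; t++) {
            int r = ends[t];
            assert(r >= 1 && r <= MAX_NODES);
            if (!seen[r]) {     /* first edge incident on region r */
                seen[r] = 1;
                node[count] = r;
                threshold[count] = (double)edges[e].num / edges[e].den;
                count++;
            }
        }
    }
    return count;
}
```

The order among edges of equal weight is left unspecified here because qsort is not a stable sort; the listings in this appendix may break such ties differently.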
VERSION 1
IDENTICAL
8-1 [12] = 0.700000 ; 7 / 10 
т] [10] = 0.500000; 1/2 
ШЕ [10] = 0.500000; 1/2 
Meee} (11) = 0.500000 ; 1 / 2 
ШЕ | [9] = 0.466667 ; 7 / 15 
INESNS] - 073353333 ,‚ 1 / 3 
ШОШО 11 - 02533353; 1/3 
ШШШ Т] - 0.333333 ; 1 / 3 
Mme} (lL) = 0.333333 ; 1 / 3 
ШЕ 1231 = 0.333333 ; 5 / 15 
ШЕ (51 = 0.307692 ; 4 / 13 
ieee) (5) = 0.285714 ; 2 / 7 
mes} (12) = 0.250000 ; 3 / 12 
fee) (18) = 0.250000 ; 3 / 12 
ШЕК (5) = 0.230769 ; 3 / 13 
в] [20] = 0.214286 ; 6 / 28 
[5] = 0.181818 ; 2 / 11 
fees) (8) = 0.181818 ; 2 / 11 
ІРО |19) - 0.181818 ; 2 / 11 
МЕЧ |13) - 0.142857 ; 2 / 14 
NEL [12] = 0.133333 ; 2 / 15 
meet i2) = 0.125000 ; 1 / 8 
ШЕГІ ГІЗІ - 0.120000; 3 / 25 
ШЕШІ |8) = 0.117647 ; 2 / 17 
mee) (20) = 0.105263 ; 2 / 19 
В | [8] = 0.103448 ; 3 / 29 
12] = 0.100000; 1/10 
В] [2] = 0.100000; 1/10 
В] [12] = 0.100000 ; 1 / 10 
ШЕР?!) |18) - 0.100000 ; 1 / 10 
ПИО [18] = 0.109000 ; 2 / 20 
В [191 = 0.090909 ; 1 / 11 
ШЕШІ?!) - 0.083933 ; 1 / 12 
ПИ [2] = 0.083333 ; 1 / 12 
ПИО [5] = 0.083333 ; 1 / 12 
Пе 1121 - 0.083333 ; 1 / 12 
ши (131 с 0.083333 ; 1 / 12 
ши [9] = 0.076923 ; 1 / 13 
ШЕГІ |21 - 0.076923 ; 1 / 13 
в [14] = 0.068966 ; 2 / 29 
ШЕРУ [8] = 0.066667 ; 1 / 15 
ВО [13] = 0.066667 ; 1 / 15 
| [19] = 0.066667 ; 1 / 15 
ШЕНІ І8| - 0.066667 ; 1 / 15 
ШЕСІ |18) - 0.066667 ; 1 / 15 
ШЕ [25] = 0.066667 ; 1 / 15 
ШЕЙ І8| - 0.062500 ; 1 / 16 


THESS] 
О в 
Ш 
DEZOT] [9 
Pie lis 

[1 

ial 


(251 


2 
9 
| 
) 
| 


4 ] 
TEZON EZ] 
ЛО] 
I[28] [20] 
ШЕП ТБ | 
ALL OTHER 


Node[1] = 
Node [2] 
Node[3] 
Node [4] 
Node [5] 
Node [6] 
Node[7] 
Node [8] 
Node [9] 
Node [10] 
Node[11] 
Node [12] 
Node [13] 
Node [14] 

] 

| 

) 

) 

) 


|| 


|| 


Моде [15 
моде [16 
Моде [17 
моде [18 
моде [19 
Моае [20] 
Node [21] 
Node [22] 
Node [23] 


= 0.062500 ; 1 / 16 
- 0.058824 ; 1 / 17 
- 0.055556 ; 1 / 18 
0.055556 ; 1 / 18 
- 0.055556 ; 1 / 18 
= 0.055556 ; 1 / 18 
- 0.052632 ; 1 / 19 
- 0.050000 ; 1 / 20 
= 91509 » 1 /.21 
- 0.031250 ; 1 / 32 
EDGES = 0.000000 
23 ,I[1] = 0.700000 
12 ,I[2] = 0.700000 
11 ,ІІЗІ = 0.500000 
10 ,I[4] » 0.500000 
17 ,I[5] = 0.500000 
22 ,I[6] = 0.466667 
9 ,I[7] = 0.466667 
ӨЗГ ЕШ! — 0.335319 
ШОО = б.э не 
eno) = 0.338623 
= 28 ,I[11] = 0.333333 
5 ,I[12] - 0.307692 
ЕИ = 025714 
= 18 ,1[14] = 0.250000 
5] = 0.214286 
= 20 ‚1[16] = 0.214286 
iit?) = 0, 131513 
Зи ие о ив 
О = © 12557 
= 2m T20] = 0.125000 
- 25 ,I[21] “ 0.105263 
cure ЛІ - 0000000 
о = О 000000 


|25) 
ТӘНІ 
eM 
С(9118) 
ОЕ 
[251 [23] 
CSS] 
CTS E рі ому 
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ЕЭШ 11] = 0.285714 ; 7 ШЕ! 92282222 дак 
С[20] [19] = 0.285714 ; 2 / 7 5 = 80211256; Зи 
2 [19] = 0.285714: 2/7 оор = 14236; 3 14 
(Пе [19] = 0.285714; 2/7 CZ 56; 3 14 
ЕМЕ [19] = 0.285714; 2/7 ЕТГІ 22440256; 3 / 14 
СІ25117| е 0.277778 ; 5 / 18 С [4 1] — 052009000 ; 2 / 10 
ШЕГІ (ВІ = 0.277778 ; 5 / 18 СВТ] 0000; 1 /%5 
E 41121 - 0.250000 ; 2 / 8 SPINE SI 89290000. ;. 1 / 5 
5](2] = 0.250000 ; 2 / 8 Сб 860000000; 1/5 
E. plz | = 0.250000 ; 2 / 8 Soe ЕО 000; 1/5 
(71 [2] = 0.250000 ; 2 / 8 СИ ЛБ = 2909000 ; 1 / 5 
ШЕУ | [5] = 0.250000 ; задню e о ТАСУ ВОО s. 1-/ 5 
С[18] [2] = 0.250000 ; 2 / 8 ЕШ ШИ - (2165661; 1/6 
О] [14] = 0.250000; 2/8 ЕЕ = 0. 166567; 1/6 
С [19] [15] = 0.250000 ; 2 / 8 СТАВІВ и 0 153846. 2 13 
С(191(16) - 0.250000 ; 2 / 8 СИИК Ор 13946; 2 / 13 
С[19] [17] = 0.250000; 2 / 8 ТИ) 0 153846; 2 / 13 
ОИЕ = 0.153845 7 2 / 13 


б] 





C[14][7] = 612 152 C[30] [13] =~ БАИ ә 
С({141{8] = ШЕ а е С[30] [19] = O0 mP T ee 
C[15] [5] = ОЕ е i3 С130] [20] ЕРО a 
C[r5]I7] 09 do а 13 С[30] [21] = 0.11112 И НАН 
С1151 181 р E ЩЕ С[30] [22] - 0.18 15221165 УИ 
Cree TS) 0. 595846 ла iS С[30] [23] = 0. 1111 АН 
ФУ ша 0.153346 2 е: С[ 2] [1] = 0.100000 и 
CHES] Qc ESTO а 15 СӘЛ А) 0.100000 ; 1/7 
СТИ 2223246 ; 2 5 C{6](1] = 90.1000007 OTI 
С. 0153886 ; а 85 С[14] [4] = 0.100000 ; 1 „ани 
СЛОВ е О. 253646 2272 LS СО А”) 0.100000 ; 1 ЖБ 
ӨНЕГЕ O TBAG 13 С ОДИ 0.100000 ; І ҒЫ 
СТАВЕ 0.159945. 2 13 СТАИ 0.100000 ; 1 лит 
СТК 0.753946 72 13 C[18] [4] 0.100000 ; 1 / 38 
СІЗГЕ 0.153846 2 1:3 Cao] 0.100000 ; 1 7 
о ОШ ЕВО ВУ E СРО] 0.100000 ; 17 
С[2- 25 02142857 зе УНИ СОКОЛ О 0.100000 ; 1708 
а О 597 9787 ОЛ ы 0.100000 ; 1 7 
СА ИЕ ТӘНЕ Оу) Duy СЕИ 0.100000 1/4 
C RE 0201125357 ІЗ А C[4][2] = 0.090202). іт 
СЕИ 0-007357 7 C[13] [4] 0.090909 ; 1 72M 
eu poss palea 0. 122557 2 / 14 CELIM G] 0.090909 ; 1 7 
СГ 0.142857 2 / 14 СЛ ЛАРЕ 0.090909 ; 1 а 
CAES] OF 142857 27 14 Се] 0.090909 ; I ди 
СӘКЕН 0.142857 204114 (Б ОЛЕНІ 0.090909 ; 10 NI 
а 0. 142657 2 7'14 СЕО ИЕБИ 0.090909; Ша 
СР ВА 0112851 1л сте Ио) 0.090909 ; TOES 
ИНС, 0.142857 Т ЛАТ ӨЛІГІ ЛЕС 0.090909 ; TE 
Cis | О 42357 a СІЗ | 0.090909 ; Си 
ФЕ 0.142857 > we С| 22 | 151 0.090909 ; Я 
СОЛИВ 05204227 ІК Т CI25 0.090909 ; 1 ШЕШ 
CLO MiS] 0.142857 27771 стари и 0.090909 ; TEPEE 
2 Па 6| 0.142857 ЈЕ ӨШЕДІ 227 0.083333 ; Беит 
Clie] 13 999 5 КЕ СО 0.083333 ; TIMES 
ШЕ - 01359957 es СТАРЫН 0.083333 ; Тони 
С [13] [7] - 025825555 2 До (22114) 0.083333 ; І1 ШЕ 
СОЕТ = ЗП ВВ И C(23] [4] = 9.053333; 18а 
C[7] [4] = 0.125000; 2 72 me СІ26І (71 - 0.07114227 С 
Cie] [4] = 0, 1250003; 2022385 С|26) 18) ="0.071429 ; Земи 
С [25116] = 9.125000: 2 EIN С [5] [2] = 0.066667 218 
C{25] [13] Юм C[5] [4] 0.066667 ; І 7458 
С126][1],:= 9120100. la E С [26] [25] = 9.066557; Ж/Ш 
CL30] I1]. 9050 А ОВО Га E С[7] [2] = 0.962509 ; ре 
С[25] [4] = 05 зла гола С[7] [3] = 0.062500; 1108 
С[14} [6] = д ; CIR ED С[8] [2] = 0.062500 ; Б 
С[15] [6] = 9111 „оне СВ] 0.062500 ; 1 Л: 
стој Геј = СЕ И C(191 17] = 0.062500 ; и 
СПТТПТОІ - е” Т E С[ 19] [8] = 0.062500; В EEE: 
С [18] [5] = 0, Пт м С[20] [7] = 0.062500 ; ГА 
C[26][2] = ОИ EE С[20] {8] = 0.952500 ла пое 
Сб ЕЛ = ОИЕ СИОТ 0.062500 ; АГ 
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Node[5] 

Моде [6] 

Моде [7 ] 

Моде [8] 

Моде [9] 

Моае [10] 
Моае [11] 
Node [12] 
Моае [13] 
Моае [14] 
Моае [15] 
Моде [16] 
Моае [17] 
Моае [18] 
Моде [19] 
Моде [20] 
Node [21] 
Node [22] 
Node [23] 
Моае [24] 
Node [25] 


=) 18 
= 20 
= 21 
= 22 
= 23 
ж 14 
= 15 
= 16 


16 
16 
16 
ЕБ 
16 
Ша 
Ша 
Pala 
4. 17 
Дт 
а 
pl 
0 


. 500000 
7200000 
. 500000 
. 500000 
.200000 


- 200000 
.428571 
„428571 
851 
.428571 
28429571 
2355636 
65636 
4522941 
2327941 
552933 
59233 
ще 33333 
и 533 
1230000 
“250000 
1222222 
566567 
.166667 
.000000 


0 №00; 1 / 
ОО; 1 / 
(ОС О 1 / 
МЕШЕЙӘПО ; I 7 
(РАБА ИО А I / 
9-24; 1 / 
0.059924 ; 1 / 
= ©. 058824 ; 1 
- 0058824 ; 1 
= 2955524 ; 1 
- 0.068824 ; 1 
= 0.058824 ; 1 
EDGES = 0.00000 
ШАП соо 
2 (82 0 
[О] = 0 
Z1 8 
, @f5] = 0 
ЕЕ 0 
, СО - 0 
Ла] => 0 
, ко = 0 
ео 10 
ШО 1] =“ 0 
ЗЕК СТАН ПЛ! 
= ер СІМ; - 0 
ен C14) = 0 
==  С115| - 0 
= 1 , С[16] = 0 
2-7]. 519 
NEU, СІ18)| - 0 
СЕЛЕ 50 
= 26 , roo] - 0 
— РОМАН Еј = 0 
= и С/22| - 0 
ВОС [231 =0 
ТЕ С [24| = 0 
jc [25] - 0 
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Ju ZO] 
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з 
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E] 

2114111) 
Е 


DL 67 
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0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
= 0.500000 
0 
0 
0 
0 
0 
0 
0 
0 
VERSION 2
COMPOSITE


00000 ; 
.714286 
116286 
. 11852259 
14265 
.714286 
.714286 
‚114286 
‚714286 
. 114286 
113296 
0.666667 

. 600000 
. 600000 
. 600000 
. 600000 
-600000 
. 600000 
. 600000 
. 600000 
. 600000 
. 600000 


© со а» Ше» Ше (< (© < < < © 


.500000 
-500000 
.200000 
.500000 
828971 
.428571 
1420511 
2822571 
-428571 
0.332941 


5: 


0. 39595 
0.353935 
0.393953 
0.333383 
0. 359855 
0.939599 
02307692 
05307592 
0307592 


0.300000 ; 


0.285714 
0.289714 


[обр ЗІ 
71261119) 
Ј[26] [20] 
Ј[26] [21] 
СІ ЕГЕР 
л ое О 
Ј[7][1] = 
J[81[1] 
J'EN] 
80757 
МИС ЕИ 
Ј(261 (41 
СЗО 


| 
Кус ОРО | 


Пи ои и и ци 
су со Се Се Су Со O O O O O O OOO OIGO OORO со ЫЕ = 


ии и 
Е > су со 


И 


oO QIS 


(C QC) CO C) Lc 


0. 
0. 


0 
42 
22 
22 
22 


Qu 


= 0. 


2659 ТАБЕ 
20271477 
„ПОЗБАВ 


„22000095 
2500007; 
22900005; 


2220000 
.2 50000 
474210101010) 
250000 
250000 
2009 
-2 90000 
7250000 
250000 
-290000 
250000 
2290010 
‚250000 
.250000 
.250000 
2510000 
2250000) 
250000 
, 290000 
„250000 
. 250000 
7290000 
,25 00010 
„250000 
‚250000 
2500008; 
250060 ; 
„250000 
„250000 
„250000 
. 200000 
. 250000 
.250000 
30769 ; 
30059 1 
222227; 
222227 


= 0222227228. 
22022225”; 
214286; 


Ј[30 
J[30 
uso 
ШЫ. 
E59] [18] 
J 550] [25] 
И 
ШЕ І2!І = 
15) [3] 
zu [6] 
Ј[8] [6] 
ИГ) 
ШІ? | (8) 
ЕГА 
е ГА 
m 5l[ 
455] [ 
СУИ! 
umo] (1] 
J [251 [4] 
Ј[14] [6] 


14 ] 
15] 
16] 
12] 


Еее еее СЗ САС от ir с іле! (жет! іле! Ба ст 


| 
6] 
ШЕ 
| 


|| 


= 0. 


014286 ; 3 / 14 
Qu E286 ; 3 / 14 
ПИРОТА ОТА И 14 
000; 2 7 10 
0S 00900. 5. 3 7 15 
ШӘЙЕШӘ 97 2/11 
шие тв ; 2711 
0.166667 ; 1 / 6 
Е. 1 / 6 
ПЕ, 2 / Із 
вам 2х 13 
(Де сата 2 7 13 
ов ет, 2 7 13 
DR 3542607 2 / 13 
(Део 2 13 
ОШ бон 2 7 13 
0595546 ;, 2 / 13 
ош ве : 2 2 13 
Е 2 / 13 
ОКЕ БИГ 2 / 13 
Е, 2 13 
(шоа ош 2 / l3 
ср о“ 2 / 13 
(Бене ; 2 / 13 
(сео ; 2./ 13 
= 17857 ; 2 / 14 
203289857 ; 2 7714 
ШЕП ЯТ ; 2/14 
2800515225957 1; 2 / 14 
2209142857). 27-14 
= о 142857 ; 17 7 
ШАЯ ; 1 / 7 
И, Пл] 
= СӘТ”; 1 7 
э ШАЛТ 1277 
и asy 272 14 
2058571 / 7 
QE. 2-7 15 
(леве = 2 / 15 
Е, 2 15 
cos 2 15 
В: 2 / 15 
(АШ РОСА 2 7 15 
1-0, 2 / 16 
- 0 1250007, 2 / l6 
2-0: 2 16 
- 0125000 ; 2 / 16 
02000; 18 
ОО; Y 7.8 
nonem 2 4 17 
Den IPC 1-/ 9 
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ГІШ ГӘ | 
oT Peis) 
За ES] 
J [ome] 6] 
еее] 
Jc | 
JO | 
Јо | [3] 
7880 ] [13] 
(30) |19) 
7/60 | [20] 
А ОК 21] 
J OZA 
КОШ | 
«ЛІНІ ГО! = 
РИ = 
ШЕШЕНІ -- 
та | 
(ЭИА) 
ЈОЦА 
ТІГІП!) 
J[18] [4] 
“ШШЕ И] 
ЖЕЛЕРІ 
ӘШІР) 
JN] 
ЈА ЈА 1 ] 
Ј[30] [4] 
теоре! 
#5] [2] 
ЕЕ] 
13] [39 
mE SS] 
раком. | 3) 
2191 |6) 
320] [3] 
QE 
ue rr] 
Ј[21] [6] 
#22! [3] 
2621 [6] 
А ЈУ] 
ШЕГЕ 
Ј[6] [4] = 
Sopa) 
J[20] [4] 
л ыра] 
J [22] [4] 
J[23][4] 
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ЖЕРІ > 
ШЕШЕК. > 
bg .; 
Е; 
S o 
ШЕРІН; 
ВЕ. ; 
ШЕШІ 
Е 
МА | 
ЕО 1 
E11] 
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0150100007; 
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0010000; 
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= 0.090909 ; 
= 0.090909 ; 
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.090909 ; 
090909. ; 
2090909; 
.090909 ; 
.090909 ; 
.090909 ; 
.090909 ; 
Poulos; 
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ВЕ; 
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14 


оо о o o II 
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: 052300 
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. 062600 
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Моае [21] 
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307% 
6, 
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= 14, 
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wa 25 ; 
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Моае [24] = 


Моае [25] 
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Ј[15] = 
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әл ЕЙ 
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15 
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Ша 
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000000 

000000 

1295 
. 114286 
. 114286 
704286 
.714286 
29955057 


256559967 


.600000 
.600000 
. 600000 
. 600000 
. 600000 
.500000 
2592941 
2352941 
2359933 
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2309000 
2300000 
.250000 
1156667 
21556667 
VERSION 3
IDENTICAL 
ИВЛ = (7209091 ; 10 / 11 
ШИ | [7] = 0.615385 ; 8 / 13 
№ [1] = 0.500000; з3 / 5 
1 [7] = 0.571429 ; 8 / 14 
8-3] [32] = 0.500000; 2/4 
Я] [32] = 0.500000; 2 / 4 
iso) (32) = 0.500000 ; 2 / 4 
ШЕСІ |32) - 0.500000 ; 2 / 4 
ЦК (52) - 0.500000; 2 / 4 
ШІСТІЗ9) - 0.500000 ; 3 / 6 
81] [39] = 0.500000; 3/6 
№91 1 [40] = 0.500000 ; 3 / 6 
те 2 | [39] = 0.500000; 3/6 
11421140) - 0.500000 ; 3 / 6 
ШЕ? І41)| = 0.500000; 3/6 
81 [7] = 0.470588 ; 8 / 17 


8 | [18] = 0.470588 ; 8 / 17 


5 | [44] = 0.470588 ; 8 / 17 
ШОО [9] = 0.428571; 9/21 
ВЕ] [5] = 0.428571; 3/7 
Е] [26] = 0.428571; 3 / 7 
mETI[33] — 0.400000 ; 2 7 5 
ШЕ [33] = 0.400000; 2/5 
1 [34] = 0.400000; 2/5 
Memeo! {33) = 0.400000 ; 2 / 5 
ШЕСІ (34| = 0.400000 ; 2 / 5 
feel (35) = 0.400000 ; 2 / 5 
ШЕТІ |33) - 0.400000 ; 2 / 5 
ШЕТ | [34] = 0.400000 ; 2 / 5 
Ш [35] = 0.400000 ; 2 / 5 
mee?) [36] = 0.400000 ; 2 / 5 
Meet?) =~ 0.380952 ; 8 / 21 
831 [9] = 0.380952; 8 / 21 
ШЕСІ [9] = 0.380952; 8/21 
ВО] [11] ШЕЗ23333 ; 2 / 6 
ШЕГІ! І24| - 0.333333 ; 2 / 6 
Ш 1126] = 0.333333; зу 9 
ши (24 с О. 333333 ; 2 / 6 
ШЕ 1381 = О.333333 ; 2 / 6 
ШЕ 01 1381 = 0.333333 ; 2 / 6 
ще 1 1381 = 0.333333 ; 2 / 6 
INE [358] 2» 0.333333 ; 2 / 6 
ШЕ (2! - 0,285714 ; 2 / 7 


NES! (3] = 0.272727 ; 3 / 11 
Meili) = 0.250000 ; 2 / 
1 


8 
ieee) (10) = 0.250000 ; / 4 
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ПА = 8 2 / 8 
кые 20000; 1/4 
о, 2 / 8 
ЕЕ 125] 05259990 7 2 и S8 
11401126; - 0.250000 ; 2 / 8 
ЛА АЕ = 12-50009; 2/8 
ШЕШЕ = 062 20000 ; 27 8 
МЕЛ = е Се 2,007 © 
ЕНШІ - 707222222 ; 2 7 9 
I[40][5] = 0.222222 ; 2 / 9
РИТУ = 2222222; 2/9 
Ва] = 0 222222: 2 7 9 
ЕЕ КЕ! - 0.200000»; 1 / 5 
оо] - 0.200000 ; 1 / 5 
ЕТӘ = 9.200000; 217 10 
I fers] [10] 01200000 ; 1 / 5 
ВЕ] = 052000002 1 / 5 
О О = 02800000 ; 1 / 5 
ТМК “е 22200000: 1/7 5 
ИР 0 9000; 1 / 5 
APPENDIX F
HISTOGRAMS 


There is one figure in this appendix for each dimension of each experimental
and random version. Each figure contains two histograms. The first histogram
shows how many edges are added to the graph in each interval of the Jaccard
coefficient. In the second histogram, the column in each Jaccard coefficient interval
shows how many nodes have their largest incident edge in that interval.

In the first histogram, the total column height in each interval shows the number
of edges that have weights in that interval. The column is divided into two parts.
The black part, labeled "Between Newly Connected Nodes," shows the number of
edges that are incident on nodes that had no incident edge in a higher threshold
interval. The gray part, labeled "Between Previously Connected Nodes," shows the
number of edges that are incident on nodes that did have an incident edge in a
higher threshold interval.

The abscissae of the histograms are labeled with the Jaccard coefficient
decreasing from left to right. The histograms are divided into intervals of 0.05. In
general, the data included in an interval are strictly less than the upper limit of the
interval and greater than or equal to the lower limit. There are two exceptions: data
in the uppermost interval are less than or equal to unity; data in the lowermost
interval are strictly greater than zero.
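
The interval rule above can be sketched in C as follows. This is an illustration only, not the code that produced the histograms: the names NUM_BINS and jaccard_bin are hypothetical, and the sketch assumes the weight is supplied as the exact ratio of two integers, which avoids floating-point rounding at the interval boundaries. Intervals are numbered from 0 (the lowermost) to 19 (the uppermost, which is drawn at the left of each histogram).

```c
#include <assert.h>

#define NUM_BINS 20             /* twenty intervals of width 0.05 */

/* Returns the interval index k such that 0.05*k <= num/den < 0.05*(k+1),
   with two exceptions: a weight of exactly 1 falls in the uppermost
   interval, and a weight of 0 is not an edge and returns -1. */
int jaccard_bin(int num, int den)
{
    assert(den > 0 && num >= 0 && num <= den);
    if (num == 0)
        return -1;                      /* zero weight: no edge to count */
    if (num == den)
        return NUM_BINS - 1;            /* unity belongs to the top interval */
    return (NUM_BINS * num) / den;      /* integer floor of 20 * num / den */
}
```

For example, a weight of 7 / 10 falls in interval 14, which covers 0.70 up to but not including 0.75.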
Figure F.1 - Version 1, Identical Bounds
Figure F.2 - Version 2, Identical Bounds
Figure F.3 - Version 3, Identical Bounds
Figure F.4 - Version 4, Identical Bounds
Figure F.5 - Version R20, Identical Bounds
Figure F.6 - Version R40, Identical Bounds
Figure F.7 - Version 1, Coincidental Bounds
Figure F.8 - Version 2, Coincidental Bounds
Figure F.9 - Version 3, Coincidental Bounds


[Figure: two histograms. Legend: Between Previously Connected Nodes; Between Newly Connected Nodes. 210 possible edges; 193 nonzero edges; 21 nodes total.
(a) Edges Added at Each Threshold. Ordinate: Number of Edges. Abscissa: Decreasing Modified Jaccard Coefficient (C(m,n)) Threshold.
(b) Newly Connected Nodes at Each Threshold. Ordinate: Number of Nodes. Abscissa: Decreasing Modified Jaccard Coefficient (C(m,n)) Threshold.]

Figure F.10 - Version 4, Coincidental Bounds 
Figure F.11 - Version R20, Coincidental Bounds
Figure F.12 - Version R40, Coincidental Bounds
Figure F.13 - Version 1, Composite Bounds
Figure F.14 - Version 2, Composite Bounds
Figure F.15 - Version 3, Composite Bounds
Figure F.16 - Version 4, Composite Bounds
Figure F.17 - Version R20, Composite Bounds